(54) Title: COISOGENIC EUKARYOTIC CELL COLLECTIONS

(57) Abstract: Collections of cultured eukaryotic cells, particularly human cells, in which the cells are coisogenic at a common
target locus, are provided. Particularly provided are collections of coisogenic cells that differ in genomic sequence by no more
than 0.05%, excluding changes at the target locus, collections in which the coisogenic cells differ in genomic sequence by no more
than 0.005%, excluding changes at the target locus, and collections in which the cells lack heterologous genetic elements within 10
kilobases of the coisogenic target locus. Kits comprising the cell collections, methods of making the collections, kits for making
the collections, and methods of using the collections to facilitate pharmacogenomic analyses are presented. Preferred target loci at
which the cells are coisogenic include genes that affect drug resistance, drug sensitivity, and/or drug metabolism.

Coisogenic Eukaryotic Cell Collections



CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of U.S. provisional application serial no. 
60/325,992, filed September 27, 2001, the disclosure of which is incorporated herein by 
reference in its entirety.

FIELD OF THE INVENTION 

The present invention is in the field of molecular biology, and relates to 
coisogenic eukaryotic cell collections and methods of use therefor. More specifically, the 
invention relates to collections of eukaryotic cells that have been engineered to differ from 
one another by as few as one encoded amino acid at a defined target locus, particularly, but
not exclusively, target loci that encode proteins that affect responsiveness to therapeutic
agents, and to pharmacogenomic methods based thereupon.

BACKGROUND OF THE INVENTION

The newly-emerging field of pharmacogenomics is premised on the notion 
that statistical correlations of genotypic variations that occur naturally within a population
(allelic variation) with their respective phenotypes can be used to predict an individual
patient's responsiveness to therapy based upon knowledge of the patient's genotype; the
ultimate goal is to stratify patient populations into genetic cohorts for which therapy can be
separately tailored. See, e.g., Adam et al., "Pharmacogenomics to predict drug response,"
Pharmacogenomics 1(1):5-14 (2000); Judson et al., "The predictive power of haplotypes in
clinical response," Pharmacogenomics 1(1):15-26 (2000).

As a preliminary to any such clinical prognostication, naturally occurring
alleles must be identified and the alleles correlated with observable clinical phenotypes. A 
sufficient number of individuals must be studied for the correlations to achieve statistical 
reliability. Each of these requirements limits the utility of current pharmacogenomic 
approaches. 

Although the first of these limitations is being addressed, in part, by public, 
quasi-public and private undertakings to identify all common single nucleotide 
polymorphisms (SNPs) in the human genome (see, e.g., NCBI's dbSNP database at 
http://www.ncbi.nlm.nih.gov/SNP/; the Karolinska Institute's Human Genic Bi-Allelic
Sequences Database at http://hgbase.cgr.ki.se/; and the SNP Consortium's database at
http://snp.cshl.org/), patients carrying uncommon, perhaps unique, alleles will remain
outside the prognostic scope of such analyses. Furthermore, the requirement for
observable clinical phenotypes and the requirement for patient populations of adequate 
statistical size are not addressed by the simple expedient of cataloguing common SNPs. 

One clinical phenotype that has been proposed for pharmacogenomic-based
prognostication is multidrug resistance. See, e.g., Kerb et al., "ABC drug
transporters: hereditary polymorphisms and pharmacological impact in MDR1, MRP1 and
MRP2," Pharmacogenomics 2(1):51-64 (2000); Szakacs et al., "Diagnostics of multidrug
resistance in cancer," Pathol. Oncol. Res. 4(4):251-7 (1998).

Genetic polymorphisms in proteins other than the multidrug transporters 
are also known to play a role in drug sensitivity and in drug resistance. For example, the
cytochrome P450 enzyme encoded by CYP2D6 is known to metabolize as many as 20% of
commonly prescribed drugs. The gene is highly polymorphic in the population; certain
alleles result in the poor metabolizer phenotype, characterized by a decreased ability to
metabolize the enzyme's substrates.

In vitro assays have been developed to assess the drug sensitivity of
individual cells. For example, U.S. Patent Nos. 6,277,655 and 5,872,014 describe assays
specific for activity of the multidrug transporter ABCB1 (MDR1), as does Ludescher et al.,
Br. J. Haematol. 82(1):161-8 (1992). See also, "In vitro assays for chemotherapy
sensitivity," Crit. Rev. Oncol. Hematol. 15(2):99-111 (1993); Cree et al., "Tumor
chemosensitivity and chemoresistance assays," Cancer 78(9):2031-2 (1996); Apoptosis and
Cell Proliferation, 2nd ed., Boehringer Mannheim, 1998 (available on-line at
http://biochem.boehringer-mannheim.com/prod_inf/manuals/cell_man/acp.pdf), and Poirier
(ed.), Apoptosis Techniques and Protocols, Humana Press, 1997 (ISBN: 0896034518).
Although the in vitro drug resistance (equally and conversely, drug
sensitivity) phenotype of individual cells can at times predict the clinical phenotype of the
entire organism, to apply such in vitro assays to pharmacogenomic analyses requires the in
vitro assay of cells bearing different alleles of the gene or genes of interest. Few such
alleles are available in cell lines that can readily be assayed, and when available, are often
present on genetically disparate backgrounds.

Recently, there have been efforts to create collections of cell lines that
have defined genetic modifications on a uniform genetic background for use in various in
vitro assays.

Genetic modifications that have typically been contemplated for eukaryotic
cells used in screening assays include targeted deletion or disruption of genes, dominant
negative suppression of gene expression, and change in gene copy number. See, e.g.,
U.S. Patent Nos. 5,569,588, 5,777,888, 6,165,709, 6,046,002. For the most part, the
preferred organism for such genetic modification has been yeast, notably Saccharomyces
cerevisiae, due in part to its ability to support homologous recombination at efficiencies far
greater than those possible in mammalian cells. Where the cell line is mammalian,
however, often the chosen modification leaves heterologous nucleic acids at or near the
target locus, a legacy of virally-mediated modification events. See, e.g., U.S. Patent No.
6,207,371.

Thus, there exists a need for methods that would more readily permit
pharmacogenomic analyses without requiring the prior large scale correlation of
naturally-occurring alleles with naturally-occurring, clinically observable phenotypes. There is a
further need in the art for collections of eukaryotic cells, particularly mammalian cells, that
have defined mutations in target loci, particularly mutations that recapitulate
naturally-occurring alleles, on a uniform genetic background. There is a particular need for
collections of eukaryotic cells that lack heterologous nucleic acid insertions additional to the
targeted changes. In particular, there exists a need for such cell collections having targeted
mutations in genes that affect drug resistance.

SUMMARY OF THE INVENTION 

The present invention satisfies these and other objects in the art by
providing, in a first aspect, a collection of cultured cells, comprising at least 5, 10, or at least
25 genotypically distinct cells, wherein each of the genotypically distinct cells is coisogenic
with respect to the others in the collection at a common target locus. The genotypically
distinct cells of the collection are separately assayable.

As used herein, two genotypically distinct cells are "coisogenic" with
respect to one another if derived from a common ancestor cell and engineered to differ from
one another in genomic sequence at a predetermined target locus. The genomic sequence
differences at the target locus must be sufficient to alter the amino acid sequence encoded
at the target locus by at least one amino acid. The term "coisogenic" permits of changes as
between the genomes of the genotypically distinct cells additional to the changes at the
target locus.

In certain preferred embodiments, the coisogenic cells of the collection are
"exceptionally coisogenic", that is, differ in genomic sequence by no more than 0.05%,
excluding changes at the target locus, or "perfectly coisogenic", differing in genomic
sequence by no more than 0.005%, excluding changes at the target locus. In certain
preferred embodiments, the cells are alternatively, or additionally, legacy-free, that is,
lacking in heterologous genetic elements within 10 kilobases of any codon of the target
locus.

The coisogenic cells can be from any eukaryote; although usefully
mammalian, especially human, the cells can also be of yeast or plant origin.

In certain embodiments, the genotypically distinct cells of the collection
collectively include each of the 20 natural amino acids at a single residue encoded at the
target locus; in other embodiments, the genotypically distinct cells collectively include a
predetermined amino acid at each residue encoded after the initiator methionine at the
target locus. In particularly preferred embodiments, the genotypically distinct cells
collectively include at least one, and on occasion a plurality, of naturally occurring alleles of
the target locus.

The cells of the collection can further comprise a common selectable 
marker at a genomic locus different from said target locus, and/or a marker unique to said 
genotypically distinct cell, the unique marker being at a locus different from the target locus. 

The target locus can be any locus of interest, and in particularly useful
embodiments, is selected from the group of loci affecting drug resistance (sensitivity) or
drug metabolism consisting of: CYP1A2, CYP2C17, CYP2D6, CYP2E, CYP3A4, CYP4A11,
CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9, CYP11A, CYP2C19,
CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3, CYP6D1, CYP6F1,
CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, CYP21A2, CYP24,
CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4, ABCC5, ABCC6,
MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2, LTA4H, TRAG3,
GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase (UGT),
sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, glutathione
S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2.

In another aspect, the invention provides the coisogenic cell collection in
the form of a kit. The kit comprises at least five genotypically distinct cells, the cells
contained within separate, structurally discrete, fluidly noncommunicating containers,
wherein each of the genotypically distinct cells is coisogenic with respect to the others at a
target locus common thereamong; the structurally discrete containers are commonly
packaged.

In some embodiments, the kit further comprises a computer readable
medium, recorded upon which is a dataset (typically, a relational database) that describes
the target locus genotype of each of said genotypically distinct cells.

In another aspect, the invention provides a method of making a coisogenic
cell collection.

In its most basic form, the method comprises collecting at least 5
genotypically distinct cells, each of the genotypically distinct cells being coisogenic with
respect to the others at a target locus common thereamong, into a collection in which each
of the genotypically distinct cells can be separately assayed.

Typically, the coisogenic cells will first be prepared, and the method will
thus further comprise the antecedent step of engineering, into at least four of five cultured
cells, the cells having derived from a common eukaryotic ancestor cell, a genomic
sequence alteration at a target locus common thereamong. For purposes of the present
invention, the sequence alterations should be sufficient to cause at least five distinct protein
sequences collectively to be encoded by the cells at the target locus.

In preferred embodiments, the engineering is effected by introducing a
targeting oligonucleotide into each of said at least four cultured cells. The targeting
oligonucleotide effects site-specific change to the cellular genomic DNA. Alternatively, in a
multistep process, a targeting oligonucleotide is first used to effect a change in a genomic
recombination-competent substrate, such as an artificial chromosome, and the
recombination-competent substrate is then introduced into each of the four cultured cells.

In another aspect, the invention provides a kit useful for creating the
coisogenic cell collections of the present invention. The kit comprises at least four targeting
oligonucleotides of distinct sequence; and a eukaryotic cell. The targeting oligonucleotides
are sufficient to effect four different sequence changes, each sequence change sufficient to
alter the protein sequence, at the target genomic locus.

The coisogenic cell collections of the present invention can be used for
multiplex, including high throughput multiplex, screening for mutations that affect a cellular
phenotype in vitro.

Thus, in another aspect, the invention provides a method of identifying
genotypes of a target locus that alter a cellular phenotype, comprising a first step of
assaying each genotypically distinct cell of a coisogenic cell collection for a common
phenotypic characteristic. The genotypically distinct cells are coisogenic at the target locus,
preferably exceptionally or perfectly coisogenic, and/or legacy-free. After assay, the
method calls for identifying from the assay results at least one cell having an altered
phenotypic characteristic; and correlating, for the cell or cells with altered phenotypic
characteristic, the results of said phenotypic assay with the cell's target locus genotype.
Such correlation of phenotypic assay results with target locus genotype identifies genotypes
of the target locus that alter the cellular phenotype.

Usefully, the phenotypic characteristic can be responsiveness of the cell to
a xenobiotic, and the method can thus include the antecedent step of contacting the
coisogenic cell collection with a xenobiotic. In certain embodiments of the method, the cells
of the collection are coisogenic at a target selected from the group consisting of: CYP1A2,
CYP2C17, CYP2D6, CYP2E, CYP3A4, CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13,
CYP2B6, CYP2C8, CYP2C9, CYP11A, CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7,
CYP4B1, CYP4F2, CYP4F3, CYP6D1, CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1,
CYP11B2, CYP17, CYP19, CYP21A2, CYP24, CYP27A1, CYP51, ABCB1, ABCB4,
ABCC1, ABCC2, ABCC3, ABCC4, ABCC5, ABCC6, MRP7, ABCC8, ABCC9, ABCC10,
ABCC11, ABCC12, EPHX1, EPHX2, LTA4H, TRAG3, GUSB, TMPT, BCRP, HERG,
hKCNE2, UDP glucuronosyl transferase (UGT), sulfotransferase, sulfatase, glutathione
S-transferase (GST) -alpha, glutathione S-transferase -mu, glutathione S-transferase -pi,
ACE, and KCHN2.

The correlations can thereafter optionally be collected into at least one
dataset, typically one or more relational databases, usefully recorded on a
computer-readable medium.

In a further aspect, the invention provides a method of predicting a
phenotypic characteristic of a cell based upon its genotype at a target locus. The method
comprises using the cell's genotype at the target locus, or a unique identifier thereof, as a
query to retrieve from a dataset data that report a correlated phenotypic characteristic,
wherein the dataset includes such correlations for at least five cells that are coisogenic at
the target locus; the retrieved phenotypic characteristic provides a prediction of the cell's
phenotypic characteristic.
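
By way of illustration only, the query step of such a prediction method might be sketched as follows in Python, using an in-memory relational database. The table name, column names, function names, and the placeholder genotype and phenotype labels are all assumptions introduced for this sketch; the text specifies only that a genotype (or a unique identifier thereof) is used as a query against a dataset holding correlations for at least five coisogenic cells.

```python
import sqlite3


def build_dataset(correlations):
    """Load (genotype, phenotype) pairs into an in-memory relational table.

    The schema is hypothetical. The check on the number of rows mirrors the
    requirement that the dataset cover at least five coisogenic cells.
    """
    rows = list(correlations)
    if len(rows) < 5:
        raise ValueError("dataset must hold correlations for at least five cells")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE correlation ("
        "genotype TEXT PRIMARY KEY, phenotype TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO correlation VALUES (?, ?)", rows)
    conn.commit()
    return conn


def predict_phenotype(conn, genotype):
    """Use a target-locus genotype identifier as the query key.

    Returns the correlated phenotype, or None when the genotype is not
    represented in the dataset.
    """
    row = conn.execute(
        "SELECT phenotype FROM correlation WHERE genotype = ?", (genotype,)
    ).fetchone()
    return row[0] if row else None


# Placeholder labels only; these are not experimental results.
example_rows = [
    ("allele_1", "phenotype_A"),
    ("allele_2", "phenotype_B"),
    ("allele_3", "phenotype_C"),
    ("allele_4", "phenotype_D"),
    ("allele_5", "phenotype_E"),
]
dataset = build_dataset(example_rows)
print(predict_phenotype(dataset, "allele_3"))
```

A keyed lookup of this kind returns a prediction only for genotypes already represented in the dataset; how to handle an unrepresented genotype is left open here.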

The above and other objects and advantages of the present invention will
be apparent upon consideration of the following detailed description. 

DETAILED DESCRIPTION OF THE INVENTION 

Definitions 

Unless otherwise made explicitly clear by context, the indefinite article "a"
intends one or more of the objects referenced immediately thereafter.

As used herein, the term "cell" intends a eukaryotic cell. Unless otherwise
made explicitly clear by context, the singular term "cell" equally intends a plurality of
genetically identical cells, such as a plurality of cells from a clonal eukaryotic cell line. A
"cultured cell" is a eukaryotic cell (or clonal eukaryotic cell line) that is maintained alive in
vitro in nutrient media, or that has previously been propagated in vitro in nutrient media for
at least one doubling.

"Genotypically distinct" cells have nonidentical genomic sequences.

A "target locus" is a genomic region that includes all exons of an
expressed protein.

As used herein, two genotypically distinct cells are "coisogenic" with
respect to one another if derived from a common ancestor cell and engineered to differ from
one another in genomic sequence at a predetermined target locus. The genomic sequence
differences at the target locus must be sufficient to alter the amino acid sequence encoded
at the target locus by at least one amino acid. The term "coisogenic" permits of changes as
between the genomes of the genotypically distinct cells additional to the changes at the
target locus.

"Exceptionally coisogenic" cells are coisogenic cells that differ in
genomic sequence by no more than 0.05%, excluding changes at the target locus.

"Perfectly coisogenic" cells are coisogenic cells that differ in genomic
sequence by no more than 0.005%, excluding changes at the target locus.
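
The relationship between the two thresholds can be illustrated with a short Python sketch. It assumes the two cells are already known to be coisogenic (common ancestor, engineered difference at the target locus), and it assumes that the percentage is computed as differing bases over compared bases outside the target locus; the text does not prescribe how the percentage is to be measured, and the function and argument names are hypothetical.

```python
from fractions import Fraction

# Thresholds from the definitions above, in percent of genomic sequence,
# excluding changes at the target locus.
EXCEPTIONAL_MAX_PERCENT = Fraction(5, 100)   # 0.05%
PERFECT_MAX_PERCENT = Fraction(5, 1000)      # 0.005%


def classify_coisogenicity(differing_bases: int, compared_bases: int) -> str:
    """Return the strictest category met by a pair of coisogenic cells.

    differing_bases: bases that differ outside the target locus (assumed input).
    compared_bases:  total bases compared outside the target locus.
    """
    if compared_bases <= 0:
        raise ValueError("compared_bases must be positive")
    if not 0 <= differing_bases <= compared_bases:
        raise ValueError("differing_bases must lie between 0 and compared_bases")
    # Exact rational arithmetic avoids rounding error at the boundaries.
    percent = Fraction(differing_bases, compared_bases) * 100
    if percent <= PERFECT_MAX_PERCENT:
        return "perfectly coisogenic"
    if percent <= EXCEPTIONAL_MAX_PERCENT:
        return "exceptionally coisogenic"
    return "coisogenic"


print(classify_coisogenicity(40, 1_000_000))
```

Because the perfect threshold is the tighter of the two, any pair meeting it also meets the exceptional threshold; the sketch reports only the strictest category satisfied.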

Cells, or genetic alterations therein, are said to be "legacy-free" if lacking
in heterologous genetic elements within 10 kilobases of an engineered genomic sequence
alteration. When used with respect to coisogenic cells, the cells are legacy-free if lacking in
heterologous genetic elements within 10 kilobases of any codon of the target locus.

As used herein, "heterologous genetic elements" are sequences of
greater than 25 consecutive nucleotides that derive from - and that can thus be shown to
be present in - species different from that from which the coisogenic cells derive;
heterologous genetic elements thus include, inter alia, all genetic elements derived from
prokaryotic cells, including prokaryotic genomic DNA; genetic elements derived from
prokaryotic episomes, including fertility factors; genetic elements derived from
bacteriophage; as well as genetic elements from eukaryotic viruses.
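
The legacy-free criterion can be pictured as a simple interval test, sketched below in Python. The sketch assumes that codons and candidate heterologous elements have already been mapped to coordinates on a single chromosome, and it reads "within 10 kilobases" as a gap of at most 10,000 bases between an element and a codon; both are assumptions, as are all of the names used. Deciding whether a sequence actually derives from a different species is outside the scope of the sketch.

```python
from dataclasses import dataclass

WINDOW_BP = 10_000        # "within 10 kilobases"
MIN_ELEMENT_LENGTH = 26   # "greater than 25 consecutive nucleotides"


@dataclass(frozen=True)
class Interval:
    """Inclusive coordinates on one chromosome (hypothetical representation)."""
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def gap(a: Interval, b: Interval) -> int:
    """Bases separating two intervals; 0 if they overlap or abut."""
    if a.end < b.start:
        return b.start - a.end - 1
    if b.end < a.start:
        return a.start - b.end - 1
    return 0


def is_legacy_free(codons, heterologous_elements) -> bool:
    """True if no qualifying heterologous element lies near any codon."""
    for element in heterologous_elements:
        if element.length < MIN_ELEMENT_LENGTH:
            continue  # too short to count as a heterologous genetic element
        if any(gap(codon, element) <= WINDOW_BP for codon in codons):
            return False
    return True


codons = [Interval(100_000, 100_002), Interval(100_003, 100_005)]
print(is_legacy_free(codons, [Interval(50_000, 50_100)]))
```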

As used herein, the term "collection", as applied to cells, intends that the
cells are in sufficient spatial proximity to one another as readily and contemporaneously to
be subject to the same experimental protocol. The term "library" is intended to be
synonymous with "collection" in all respects.

As used herein, the term "xenobiotic" intends a foreign compound
introduced into a biological system, such as an inorganic or organic compound foreign to
the cell or organism under study, or a compound naturally present in the cell or organism
under study but administered by nonnatural routes or at unnatural concentrations.

Coisogenic eukaryotic cell collections, methods of making, and methods of use

The present invention is made possible by our recent discovery of methods
and compositions, to be described in further detail below, for creating site-specific mutations
in genomic DNA of eukaryotic cells, including mammalian cells, at efficiencies and with a
precision not hitherto achievable using homologous recombination or earlier approaches
based upon oligonucleotide-mediated gene repair.

The methods permit point mutations to be targeted with high efficiency to
genomic DNA incubated in cellular extracts, such as artificial chromosomes incubated in
cellular extracts, and also permit mutations to be targeted with high efficiency directly into
the chromosomes of cultured cells. The efficiency is sufficiently high as to obviate the
concomitant insertion of selectable markers or other exogenous DNA, permitting cells with
defined mutations to be created legacy-free. These methods permit us readily to create
collections of coisogenic eukaryotic cell lines, including legacy-free, perfectly coisogenic cell
lines, that possess targeted and discrete changes at given target loci.

These collections of coisogenic cells have substantial utility in
pharmacogenomic studies, obviating the identification of naturally-occurring allelic variants,
observation of naturally occurring clinically-relevant phenotypes in a human population, and
association of the naturally-occurring allelic variants with the naturally-occurring,
clinically-relevant phenotypes. In embodiments particularly useful for pharmacogenomic studies, the
target loci at which the collection of cells are coisogenic encode proteins known to affect
drug resistance (conversely, drug sensitivity), and drug metabolism.

The collections of coisogenic cells have further utility in studies of the
structure-activity relationships of existing, and of potential new, therapeutic agents,
permitting multiplex analysis of the effects of amino acid changes on ligand-receptor
interactions. The collections of coisogenic cells are also useful in screening for agonists
and antagonists of proteins that affect drug resistance, sensitivity, and metabolism.

Thus, in a first aspect, the invention provides a collection of at least 5
genotypically distinct cells, typically as a collection of at least 5 genotypically distinct
eukaryotic cell lines. Each of the genotypically distinct cells (or cell lines) is coisogenic to
the others of the genotypically distinct cells (or cell lines) in the collection at a common
target locus. In addition, each of the genotypically distinct cells can be separately assayed.

Given the generality of our oligonucleotide-mediated mutational approach,
the cultured cells of the invention can be any eukaryotic cell amenable to in vitro culture.

Among mammalian cells, human cells have particular utility, particularly for
pharmacogenomic uses. Also very useful, particularly for structure-activity studies, are cells
from related primates, such as chimpanzee, monkeys (including rhesus macaque), baboon,
orangutan, and gorilla, and those from rodents typically used as laboratory models, such as
rats, mice, hamsters and guinea pigs. Cells can also usefully be from lagomorphs, such as
rabbits; and from larger mammals, such as livestock, including horses, cattle, sheep, pigs,
goats, and bison. Also useful are cells from fowl such as chickens, geese, ducks, turkeys,
pheasant, ostrich and pigeon; fish such as zebrafish, salmon, tilapia, catfish, trout and bass;
and domestic pet species, such as dogs and cats.

Plant cells for which coisogenic cell collections can usefully be constructed
according to the methods of the present invention include, for example, experimental model
plants, such as Chlamydomonas reinhardtii, Physcomitrella patens, and Arabidopsis
thaliana; crop plants such as cauliflower (Brassica oleracea), artichoke (Cynara scolymus);
fruits such as apples (Malus, e.g. Malus domesticus), mangoes (Mangifera, e.g. Mangifera
indica), banana (Musa, e.g. Musa acuminata), berries (such as currant, Ribes, e.g. rubrum),
kiwifruit (Actinidia, e.g. chinensis), grapes (Vitis, e.g. vinifera), bell peppers (Capsicum, e.g.
Capsicum annuum), cherries (such as the sweet cherry, Prunus, e.g. avium), cucumber
(Cucumis, e.g. sativus), melons (Cucumis, e.g. melo), nuts (such as walnut, Juglans, e.g.
regia; peanut, Arachis hypogaea), orange (Citrus, e.g. maxima), peach (Prunus, e.g. Prunus
persica), pear (Pyrus, e.g. communis), plum (Prunus, e.g. domestica), strawberry (Fragaria,
e.g. moschata or vesca), tomato (Lycopersicon, e.g. esculentum); leaves and forage, such
as alfalfa (Medicago, e.g. sativa or truncatula), cabbage (e.g. Brassica oleracea), endive
(Cichorium, e.g. endivia), leek (Allium, e.g. porrum), lettuce (Lactuca, e.g. sativa), spinach
(Spinacia, e.g. oleracea), tobacco (Nicotiana, e.g. tabacum); roots, such as arrowroot
(Maranta, e.g. arundinacea), beet (Beta, e.g. vulgaris), carrot (Daucus, e.g. carota),
cassava (Manihot, e.g. esculenta), turnip (Brassica, e.g. rapa), radish (Raphanus, e.g.
sativus), yam (Dioscorea, e.g. esculenta), sweet potato (Ipomoea batatas); seeds, including
oilseeds, such as beans (Phaseolus, e.g. vulgaris), pea (Pisum, e.g. sativum), soybean
(Glycine, e.g. max), cowpea (Vigna unguiculata), mothbean (Vigna aconitifolia), wheat
(Triticum, e.g. aestivum), sorghum (Sorghum, e.g. bicolor), barley (Hordeum, e.g. vulgare),
corn (Zea, e.g. mays), rice (Oryza, e.g. sativa), rapeseed (Brassica napus), millet (Panicum
sp.), sunflower (Helianthus annuus), oats (Avena sativa), chickpea (Cicer, e.g. arietinum);
tubers, such as kohlrabi (Brassica, e.g. oleracea), potato (Solanum, e.g. tuberosum) and
the like; fiber and wood plants, such as flax (Linum, e.g. Linum usitatissimum), cotton
(Gossypium, e.g. hirsutum), pine (Pinus spp.), oak (Quercus sp.), eucalyptus (Eucalyptus
sp.), and the like; and ornamental plants such as turfgrass (Lolium, e.g. rigidum), petunia
(Petunia, e.g. x hybrida), hyacinth (Hyacinthus orientalis), carnation (Dianthus, e.g.
caryophyllus), delphinium (Delphinium, e.g. ajacis), Job's tears (Coix lacryma-jobi),
snapdragon (Antirrhinum majus), poppy (Papaver, e.g. nudicaule), lilac (Syringa, e.g.
vulgaris), hydrangea (Hydrangea, e.g. macrophylla), roses (including Gallicas, Albas,
Damasks, Damask Perpetuals, Centifolias, Chinas, Teas and Hybrid Teas) and ornamental
goldenrods (e.g. Solidago spp.).

Given the conservation of basic metabolic pathways among all eukaryotes,
cell collections of the present invention can also usefully be drawn from lower eukaryotes,
such as yeasts, particularly Saccharomyces cerevisiae, Schizosaccharomyces pombe,
Pichia species, such as methanolica, Ustilago maydis, and Candida species, from
roundworms, such as C. elegans, from zebra fish, and from Drosophila melanogaster.

Eukaryotic cell lines from which coisogenic collections of the present
invention may be created are readily available from a wide variety of sources known in the
art, including the American Type Culture Collection (Manassas, VA, USA), the Deutsche
Sammlung von Mikroorganismen und Zellkulturen (DSMZ, German Collection of
Microorganisms and Cell Cultures), and the Riken Cell Bank of Japan; 472 such culture
collections are listed at http://wdcm.nig.ac.jp/hpcc.html.

Specialized cell collections are also well known, and include the NIGMS
(National Institute of General Medical Sciences) Human Genetic Cell Repository, the NIA
Aging Cell Repository, the Autism Research Resource, the ADA Cell Repository Maturity
Onset Diabetes Collection, and the HBDI Cell Repository Juvenile Diabetes Collection, all of
which are maintained at the Coriell Institute for Medical Research (Camden, NJ, USA).
Specialized yeast collections include the National Collection of Yeast Cultures (Institute of
Food Research, Norwich Research Park, Colney, Norwich, UK).

Existing cell lines are also amply well described in the literature. See, e.g.,
Drexler, The Leukemia-Lymphoma Cell Line FactsBook (ISBN: 0122219708) (2000); Hay
et al. (eds.), Atlas of Human Tumor Cell Lines, Academic Press, 1994 (ISBN: 0123335302);
Masters et al. (eds.), Human Cell Culture: Cancer Cell Lines: Leukemias and Lymphomas,
Vol. III, Kluwer Academic, 2000 (ISBN: 079236225X); Dix (ed.), Plant Cell Line Selection:
Procedures and Applications, John Wiley and Sons, 1990 (ISBN: 3527279636); Panchal
(ed.), Yeast Strain Selection, Marcel Dekker, 1990 (ISBN: 0824782763).

Furthermore, methods are well known in the art for creating immortalized
cell lines from a wide variety of primary cells having advantageous characteristics. For
recent reviews see, e.g., Yeager et al., "Constructing immortalized human cell lines," Curr.
Opin. Biotechnol. 10(5):465-9 (1999); Rhim, "Development of human cell lines from multiple
organs," Ann. NY Acad. Sci. 919:16-25 (2000); McLean, "Improved techniques for
immortalizing animal cells," Trends Biotechnol. 11(6):232-8 (1993); and Hopfer et al.,
"Immortalization of epithelial cells," Am. J. Physiol. 270(1 Pt 1):C1-C11 (1996).

Although at times preferred for convenience, the genotypically distinct cells
need not be immortalized, or otherwise capable of indefinite propagation.

The collection includes at least 5 coisogenic cells (typically, as clonal cell
lines). Higher assay throughput is often obtained when the collection includes greater than
5, such as 6, 7, 8, 9, or 10 genotypically distinct, coisogenic cells. Collections of 24
coisogenic cells can conveniently be disposed in a 24 well culture plate; collections of 96
coisogenic cells can conveniently be arrayed in a 96 well microtiter dish. With recent
development of microtiter dishes with footprint identical to that of the standard microtiter
dish, but with higher well density, collections of 384, 864, 1536, 3456, 6144, and as many
as 9600 coisogenic cells can readily and usefully be present in the cell collections of the
present invention. The collections need not necessarily contain such even numbers of
genotypically distinct exceptionally coisogenic cells, and can thus include any number of
genotypically distinct coisogenic cells greater than or equal to 5, including 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, 100, 200, 300, 400,
500 or more.
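
For illustration, the arraying of a collection into a multiwell plate can be sketched in Python as a mapping from each cell line to a well label. The row-major fill order, the function names, and the cell identifiers are assumptions made for this sketch; the row and column counts used are the conventional geometries of 24, 96, and 384 well plates, and the higher-density formats are omitted for brevity.

```python
# Conventional (rows, columns) geometry for common plate formats.
PLATE_GEOMETRY = {24: (4, 6), 96: (8, 12), 384: (16, 24)}


def well_label(index: int, plate_size: int = 96) -> str:
    """Map a zero-based position to a well label, filling row by row."""
    if plate_size not in PLATE_GEOMETRY:
        raise ValueError(f"unsupported plate size: {plate_size}")
    if not 0 <= index < plate_size:
        raise IndexError("index outside the plate")
    _, cols = PLATE_GEOMETRY[plate_size]
    row, col = divmod(index, cols)
    return f"{chr(ord('A') + row)}{col + 1}"


def assign_wells(cell_ids, plate_size: int = 96) -> dict:
    """Give each genotypically distinct cell line its own well."""
    cell_ids = list(cell_ids)
    if len(cell_ids) > plate_size:
        raise ValueError("more cell lines than wells")
    return {cid: well_label(i, plate_size) for i, cid in enumerate(cell_ids)}


# Hypothetical identifiers for a five-member collection.
layout = assign_wells([f"line_{n}" for n in range(1, 6)], plate_size=24)
print(layout)
```

One well per cell line keeps each genotypically distinct cell separately assayable; a collection smaller than the plate simply leaves the remaining wells unassigned.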

At least five of the genotypically distinct cells of the collections of the
present invention are coisogenic at a common, predetermined, target locus. The target
locus can be any protein-encoding locus of the cell. As will be further described below,
preferred targets for pharmacogenomic studies encode proteins known to be involved in
drug resistance and/or drug metabolism.

As defined herein, coisogenic cells have genomic sequence differences at
the target locus that are sufficient to occasion change of at least one amino acid at the
target locus. The genotypically distinct cells of the collection are coisogenic to the others of
the genotypically distinct cells of the collection.

The methods and compositions for creating the coisogenic cells, which are
further described below, readily permit the legacy-free substitution, addition, or deletion of
as few as 1 and as many as 3 consecutive nucleotides in the genomic DNA of the target
locus.

Alterations can include, for example, substitutions of one, two or three
contiguous nucleotides, thus effecting a change in the amino acid encoded by one codon or
by two adjacent codons. Since the standard genetic code is well known, the nucleotide
changes required to effect change from any given codon to one that encodes any other
desired amino acid would be apparent to the skilled artisan; examples are also presented
herein below.
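
By way of illustration only, the codon arithmetic referred to above can be sketched in
Python. The function name `minimal_codon_changes` is hypothetical and forms no part of the
methods described herein; the sketch assumes the standard genetic code and considers only
substitutions within a single codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def minimal_codon_changes(codon, target_aa):
    """Return (codons, n): the codons for target_aa that are reachable from
    `codon` with the fewest nucleotide substitutions, and that number n.

    target_aa is a one-letter amino acid code, or "*" for a termination codon.
    """
    codon = codon.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a codon: {codon!r}")
    candidates = [c for c, aa in CODON_TABLE.items() if aa == target_aa]
    if not candidates:
        raise ValueError(f"unknown amino acid code: {target_aa!r}")

    def distance(other):
        # Number of positions at which the two codons differ.
        return sum(a != b for a, b in zip(codon, other))

    best = min(distance(c) for c in candidates)
    return sorted(c for c in candidates if distance(c) == best), best
```

For example, `minimal_codon_changes("GCT", "S")` identifies TCT as the only serine codon
reachable from GCT by a single substitution, and passing `"*"` as the target lists the
nearest termination codons.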

In one such embodiment, one predetermined amino acid residue is
commonly targeted for change in each of the coisogenic cells; with a minimum of 20
genotypically distinct cells in the collection, each of the commonly occurring natural amino
acids can be present in the collection at the target residue. Residues that are particularly
informative as targets are those that occur in the protein at locations of known structural
and/or functional importance, such as within highly structured, ligand-binding domains.

In an alternative embodiment, the genotypically distinct cells can differ not
at the identical residue, but at successive amino acids of the target protein. By way of
example, each genotypically distinct cell can contain a single alanine substitution. Thus,
without disturbing the initiator methionine, the first cell of the collection can have alanine
substituted for residue 2; the second cell of the collection can have alanine substituted for
residue 3; the third cell of the collection can have alanine substituted for residue 4, etc.
Collectively, the coisogenic cells of the cell collection present an in vivo alanine scan of the
entire protein sequence, permitting ready identification of critical residues of the target
protein.
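
The design of such a scan can be sketched, purely for illustration, as follows. The function
name `alanine_scan` and the toy sequence in the usage note are hypothetical; the sketch
assumes a one-letter protein sequence beginning with the initiator methionine, and it skips
residues that are already alanine, which is an assumption not stated above.

```python
def alanine_scan(protein, substitute="A"):
    """List (label, variant) pairs, one single-residue substitution per variant.

    Residue numbering is 1-based. Residue 1 (the initiator methionine) is left
    undisturbed, so the scan starts at residue 2.
    """
    variants = []
    for pos in range(2, len(protein) + 1):
        original = protein[pos - 1]
        if original == substitute:
            continue  # assumption: no variant is needed where the residue is unchanged
        variant = protein[:pos - 1] + substitute + protein[pos:]
        variants.append((f"{original}{pos}{substitute}", variant))
    return variants
```

Called on the toy sequence `"MKAL"`, the sketch yields the variants K2A and L4A, each of
which would correspond to one genotypically distinct cell of the collection. Passing another
one-letter code as `substitute` gives the analogous scan for a different amino acid.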

Any amino acid can be used as the substitute in such an embodiment, with
the choice dictated by the known chemical and biological properties of the naturally
occurring amino acids. For example, proline can be substituted to effect disruption of
secondary structures, such as beta sheets or alpha helices; tyrosine can be substituted to
provide substrates for tyrosine-kinase mediated post-translational modification; glutamic
acid can be substituted to increase local charge density.

Alterations can also include introduction of a termination codon. Because
any codon of the target locus can be targeted, coisogenic cells can be collected that each
individually possess a single engineered termination codon, but that collectively present
consecutive, single amino acid truncations from the carboxy terminus of the target protein.

Alterations can also include insertion of an amino acid, through targeted
insertion of a novel codon between two existing codons.

Alterations can, in other embodiments, include frameshift mutations,
caused by insertion or deletion of 1 or 2 nucleotides. Frameshift can lead to truncation or
elongation, depending upon presence of termination codons in the new reading frame.
Introduction of compensating frameshifts (e.g., insertion of a single nucleotide followed, at
some distance downstream, by deletion of a single nucleotide), can lead to alteration of a
series of amino acids between the mutated nucleotides.
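
The effect of compensating frameshifts can be illustrated with a short Python sketch. The
function names and the 21-nucleotide sequence below are hypothetical and chosen only for
illustration; positions are 0-based offsets in the unmodified sequence, and the standard
genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate complete codons of seq in frame 0; '*' marks a termination codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def compensating_frameshift(seq, insert_at, base, delete_at):
    """Insert `base` before offset insert_at and delete the nucleotide at the
    downstream offset delete_at, so that the reading frame is restored."""
    if not insert_at < delete_at < len(seq):
        raise ValueError("the deletion must lie downstream of the insertion")
    return seq[:insert_at] + base + seq[insert_at:delete_at] + seq[delete_at + 1:]


# Toy open reading frame: ATG GCT AAA GAT CTG GGT TAA
WILD_TYPE = "ATGGCTAAAGATCTGGGTTAA"
MUTANT = compensating_frameshift(WILD_TYPE, insert_at=6, base="C", delete_at=14)

print(translate(WILD_TYPE))  # MAKDLG*
print(translate(MUTANT))     # MAQRSG*
```

In this toy case only the three residues between the insertion and the deletion are altered;
the residues upstream and downstream, and the termination codon, are unchanged.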

Other types of changes that can be created by targeted point mutations will
be readily apparent to one skilled in the art.

Among the changes that can usefully be made, and that have particular
utility for pharmacogenomic studies, are those that recapitulate naturally-occurring allelic
variants at the target locus; such changes permit the phenotype occasioned by
naturally occurring alleles to be assessed against a common, defined, genetic background.

As would be understood, highly multiplex analyses can be done by
combining the mutational embodiments set forth above. For example, the collection can
include cells that are coisogenic at a first residue of the target locus, with the collection
including all possible amino acids at that first target residue, with the collection further
including cells that have substitutions at other residues of the target locus.

Greater differences can be achieved by targeting changes iteratively to the
target locus using the methods of the present invention.

Furthermore, changes can be introduced into both alleles of the target
locus, either in a single step or by iterative modification, thus creating a homozygous
change. At present, homozygous changes are most desired, although heterozygous
changes are permitted.

In certain embodiments, the coisogenic cells are legacy-free. 

In certain embodiments, our methods for constructing coisogenic cell
collections, further described below, can alter genomic DNA without concomitant insertion of
heterologous nucleic acids, such as selectable markers, prokaryotic genetic elements,
bacteriophage genetic elements, or eukaryotic viral elements, at the target locus. Because
such heterologous nucleic acids close to the target locus can cause unpredictable changes in
expression and/or activity of the target protein, they are disfavored, although permitted, in
certain embodiments of the cell collections of the present invention.

Depending on their distance from a common cellular ancestor, the
coisogenic cells of the present invention will, on occasion, have accumulated genetic
differences at other than the target locus. Such differences are permissible.

In certain particularly useful embodiments, however, the coisogenic cells of
the collections of the present invention are "exceptionally coisogenic", differing in genomic
sequence by no more than 0.05%, excluding changes at the target locus. In other
embodiments, the cells are "perfectly coisogenic", differing in genomic sequence by no
more than about 0.005%, excluding changes at the target locus. The exceptionally
coisogenic cell collections and perfectly coisogenic cell collections of the present invention
can each, additionally, be legacy-free.
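
The two thresholds can be applied as in the following sketch, offered only as an
illustration. The function name `classify_coisogenicity` is hypothetical. The sketch assumes
that the caller supplies the number of differing bases already excluding the target locus,
that differences are counted per base of genome, and that the "about 0.005%" limit is treated
as an exact cutoff; none of these conventions is specified above.

```python
def classify_coisogenicity(differing_bases, genome_size):
    """Classify a pair of cells by percent genomic sequence difference.

    differing_bases: bases that differ, NOT counting changes at the target locus.
    genome_size: total number of bases compared.
    """
    if genome_size <= 0 or differing_bases < 0:
        raise ValueError("genome_size must be positive and differing_bases non-negative")
    percent = 100.0 * differing_bases / genome_size
    if percent <= 0.005:
        return "perfectly coisogenic"
    if percent <= 0.05:
        return "exceptionally coisogenic"
    return "coisogenic"
```

For a genome of 3,000,000,000 bases, the 0.05% limit corresponds to 1,500,000 differing
bases and the 0.005% limit to 150,000 differing bases.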

The coisogenic cells of the cell collections of the present invention can also
include intentional genetic changes at locations in the genome other than the target locus.

For example, mutations can be targeted to a second target locus, creating
cell lines that are coisogenic at several targets.

As another example, markers, including selectable markers, can usefully,
but optionally, be included at a site other than the target locus. Such a marker can be
common to all cells in the collection, for example by prior introduction into a cellular
ancestor common to all of the genotypically distinct cells, can be unique to each genotype,
or can be common to some, but not to all, genotypically distinct cells in the collection.

For example, a selectable marker can commonly be included in all of the
genotypically distinct cells of the collection to prevent overgrowth, either by cells of the
same lineage, or by other species. Selectable markers are well known, and the choice
thereof will depend upon the species from which the genotypically distinct cells of the
collection are derived. Selectable markers for use in mammalian cells, e.g., include
markers that confer resistance to neomycin (G418), blasticidin, hygromycin or to zeocin;
other well-known selections are based upon the purine salvage pathway. Selectable
markers in yeast include a variety of auxotrophic markers, such as alleles of URA3, HIS3,
LEU2, TRP1 and LYS2.

At the other end of the spectrum, unique markers can be introduced into
each of the genotypically distinct cells of the collection, allowing each genotypically distinct
cell (typically, cell line) in the collection readily to be distinguished.

For example, the sequence can encode substrate-independent
proteinaceous fluorophores with distinct emission spectra. See, e.g., Palm et al., "Spectral
Variants of Green Fluorescent Protein," in Green Fluorescent Proteins, Conn (ed.), Methods
Enzymol. vol. 302, pp. 378 - 394 (1999), the disclosure of which is incorporated herein by
reference.

The markers can also be intended to distinguish the cells at the nucleic
acid, rather than protein, level (genetic "bar codes"). If such bar codes are flanked by
priming sites that are common to all of the bar codes of distinct sequence, a single
amplification reaction (e.g., by PCR) can be used stoichiometrically to amplify all bar
codes, the presence and/or frequencies of which can thereafter readily be assayed. See,
e.g., U.S. Patent No. 6,046,002.

Other genetic alterations that can usefully be made outside the target locus
include those that facilitate assay of the cells of the coisogenic cell collection of the present
invention, as will be discussed below.

The target locus for the coisogenic cell collections of the present invention
can be any locus believed to contribute to a relevant cellular or organismic phenotype, and
thus usefully includes all proteins that are presently subject to drug screening assays (e.g.,
G protein coupled receptors, protein kinases, zinc finger-containing transcription factors), or
pharmacogenomic analysis (such as ApoE, presenilin 1, presenilin 2, p53, etc.). Particularly
useful targets in certain embodiments of the present invention are loci that encode proteins
that affect drug responsiveness, in part because the clinical phenotype can readily be
correlated with a cellular phenotype, permitting ready assay in vitro.

Accordingly, the cell collections of the present invention can usefully be 
coisogenic at loci that encode any one of the P450 enzymes, which are known significantly 
to affect the metabolism of many, if not most, therapeutic agents. 

The cytochrome P450 superfamily includes a large number (as many as 60
in human beings) of separate, but related, monooxygenases that play a central role in
oxidative metabolism of a wide range of compounds, including therapeutic drugs. Although
the number of known P450 enzymes is large, and the endogenous substrates of most
unknown, a half dozen or so appear to be responsible for metabolism of the vast majority of
prescribed and over-the-counter drugs: CYP1A2, CYP2C17, CYP2D6, CYP2E ("CYP2E1"),
CYP3A4, and CYP4A11. For recent reviews, see Anzenbacher et al., "Cytochromes P450
and metabolism of xenobiotics," Cell. Mol. Life Sci. 58(5-6):737-47 (2001), and Drug Ther.
Bull. 38(12):93-5 (2000).

The cell collections of the present invention can thus usefully be coisogenic
at CYP1A2 (cytochrome P450, subfamily I (aromatic compound-inducible), polypeptide 2)
(also known as CP12, P3-450, P450(PA)). This gene, the human homologue of which is
located about 25 kb away from CYP1A1 on chromosome 15 (at 15q22-qter), encodes a
member of the cytochrome P450 superfamily of enzymes closely related to CYP1A1. The
gene is aromatic compound-inducible, and is known to metabolize acetaminophen in human
beings to the cytotoxic metabolite N-acetylbenzoquinoneimine (NABQI), Thatcher et al.,
Cancer Gene Ther. 7(4):521-5 (2000).

CYP2C17 can also usefully be targeted.

CYP2D6 (also known as CPD6, CYP2D, CYP2D@, P450C2D, P450-DB1)
encodes cytochrome P450, subfamily IID (debrisoquine, sparteine, etc., -metabolizing),
polypeptide 6, and is known to metabolize as many as 20% of commonly prescribed drugs;
the cell collections of the present invention can usefully be coisogenic at this locus.

The enzyme's substrates include debrisoquine, an adrenergic-blocking
drug; sparteine and propafenone, both anti-arrhythmic drugs; and amitriptyline, an
anti-depressant. The gene is highly polymorphic in the population; certain alleles result in
the poor metabolizer phenotype, characterized by a decreased ability to metabolize the
enzyme's substrates. The gene is located near two cytochrome P450 pseudogenes on
chromosome 22q13.1.

CYP2E (earlier denominated CPE1, CYP2E1, P450-J, P450C2E) encodes
cytochrome P450, subfamily IIE (ethanol-inducible), located in the human genome at
10q24.3-qter, and can usefully be targeted in constructing coisogenic cell collections of the
present invention. This P450 enzyme localizes to the endoplasmic reticulum and is induced
by ethanol, the diabetic state, and starvation. The enzyme metabolizes both endogenous
substrates, such as ethanol, acetone, and acetal, as well as exogenous substrates including
benzene, carbon tetrachloride, ethylene glycol, and nitrosamines which are premutagens
found in cigarette smoke. Due to its many substrates, this enzyme may be involved in such
varied processes as gluconeogenesis, hepatic cirrhosis, diabetes, and cancer.

Another locus at which the cell collections of the present invention can
usefully be coisogenic is CYP3A4 (also known as CP34, NF-25, P450C3, P450PCN1),
which encodes cytochrome P450, subfamily IIIA (nifedipine oxidase), polypeptide 4.

The enzyme encoded by CYP3A4 localizes to the endoplasmic reticulum
and its expression is induced by glucocorticoids and some pharmacological agents. This
enzyme is involved in the metabolism of approximately half the drugs used today, including
nifedipine, acetaminophen, codeine, cyclosporin A, diazepam and erythromycin. The
enzyme also metabolizes some steroids and carcinogens.

Vinca alkaloids are important chemotherapeutic agents, and their
pharmacokinetic properties display significant interindividual variations, possibly due to
CYP3A4-mediated metabolism. See, Yao et al., "Detoxication of vinca alkaloids by human
P450 CYP3A4-mediated metabolism: implications for the development of drug resistance,"
J. Pharmacol. Exp. Ther. 294(1):387-95 (2000).

This gene is part of a cluster of cytochrome P450 genes on chromosome
7q21.1. Previously, another CYP3A gene, CYP3A3, was thought to exist; however, it is
now thought that this sequence represents a transcript variant of CYP3A4.

CYP4A11 (also called CP4Y, CYP4A2, CYP4AII) encodes cytochrome
P450, subfamily IVA, polypeptide 11, and can usefully serve as a target locus for the
coisogenic cell collections of the present invention. CYP4A11 encodes a member of the
cytochrome P450 superfamily of enzymes. This protein localizes to the endoplasmic
reticulum and hydroxylates medium-chain fatty acids such as laurate and myristate.

Other cytochrome P450 enzymes can also usefully be targeted.

CYP1B1 (synonyms: CP1B, GLC3A), another target at which the cell
collections of the present invention can usefully be coisogenic, encodes cytochrome P450,
subfamily I (dioxin-inducible), polypeptide 1 (glaucoma 3, primary infantile), located in the
human genome at 2p21. The P450 monooxygenase encoded by this gene localizes to the
endoplasmic reticulum and metabolizes procarcinogens such as polycyclic aromatic
hydrocarbons and 17beta-estradiol. Mutations in this gene have been associated with
primary congenital glaucoma; therefore it is thought that the enzyme also metabolizes a
signaling molecule involved in eye development, possibly a steroid.

Expression of CYP1B1, as with expression of CYP1A1, has been shown to
be increased in an anti-estrogen-resistant breast cell line, Brockdorff et al., Int. J. Cancer
88(6):902-6 (2000), and has been generally implicated in tumor drug resistance, Rochat et
al., "Human CYP1B1 and anticancer agent metabolism: mechanism for tumor-specific drug
inactivation?", J. Pharmacol. Exp. Ther. 296(2):537-41 (2001); McFadyen et al.,
"Cytochrome P450 CYP1B1 protein expression: a novel mechanism of anticancer drug
resistance," Biochem. Pharmacol. 62(2):207-12 (2001).

CYP1A1 (cytochrome P450, subfamily I (aromatic compound-inducible),
polypeptide 1) (also known as AHH, AHRR, CP11, CYP1, P1450, P450-C, P450DX), the
human homologue of which is located at 15q22-24, can also usefully be targeted.
Expression and activity of CYP1A are known to be induced by some polycyclic aromatic
hydrocarbons (PAHs), some of which are found in cigarette smoke, and the enzyme is able
to metabolize some PAHs to carcinogenic intermediates; the gene has specifically been
associated with lung cancer risk.

CYP1A activity has been shown to be increased in a breast cell line
resistant to the antiestrogen compound ICI 182,780, Brockdorff et al., "Increased expression
of cytochrome p450 1A1 and 1B1 genes in anti-estrogen-resistant human breast cancer cell
lines," Int. J. Cancer 88(6):902-6 (2000), and has been suggested as a marker for sensitivity
to anti-cancer drugs, Peters et al., "A mutation in exon 7 of the human cytochrome
P-4501A1 gene as marker for sensitivity to anti-cancer drugs?", Br. J. Cancer 75(9):1397
(1997).

Another target for which cell collections of the present invention can
usefully be coisogenic is CYP2A6, the human homologue of which is found at 19q13.2,
encoding cytochrome P450, subfamily IIA (phenobarbital-inducible), polypeptide 6 (also
known as CPA6, CYP2A3). CYP2A6 encodes a P450 enzyme that localizes to the
endoplasmic reticulum; its expression is induced by phenobarbital. The enzyme is known to
hydroxylate coumarin, and also metabolizes nicotine, aflatoxin B1, nitrosamines, and some
pharmaceuticals.

Individuals with certain allelic variants of CYP2A6 are said to have a "poor
metabolizer" phenotype, meaning they do not efficiently metabolize drugs that are
substantially metabolized by CYP2A6, such as coumarin, nicotine, or fluoxetine (Prozac®).
CYP2A6 is part of a large cluster of cytochrome P450 genes from the CYP2A, CYP2B and
CYP2F subfamilies on chromosome 19q.

CYP2A6 is predominantly responsible for the metabolism of nicotine to
cotinine, and many allelic variants have been described. See, Zabetian et al., "Functional
variants at CYP2A6: new genotyping methods, population genetics, and relevance to
studies of tobacco dependence," Am. J. Med. Genet. 96(5):638-45 (2000).

Another cytochrome P450 enzyme that can usefully be targeted in the
coisogenic cell collections of the present invention is CYP2A13 (also known as CPAD), the
human homologue of which is located at 19q13.2. CYP2A13 is phenobarbital-inducible,
and is highly active in the metabolic activation of a major tobacco-specific carcinogen, 4-
(methylnitrosamino)-1-(3-pyridyl)-1-butanone, with a catalytic efficiency much greater than
that of other human cytochrome P450 isoforms. Su et al., "Human cytochrome P450
CYP2A13: predominant expression in the respiratory tract and its high efficiency metabolic
activation of a tobacco-specific carcinogen, 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone,"
Cancer Res. 60(18):5074-9 (2000).

CYP2B6 (alternatively denominated CPB6, IIB1, P450, and CYPIIB6),
encoding cytochrome P450, subfamily IIB (phenobarbital-inducible), polypeptide 6, is
located at 19q13.2 in the human genome, and is a useful target locus for the coisogenic cell
collections of the present invention. This P450 enzyme localizes to the endoplasmic
reticulum and its expression is induced by phenobarbital. The enzyme is known to
metabolize some xenobiotics, such as the anti-cancer drugs cyclophosphamide and
ifosphamide. Transcript variants for this gene have been described; however, it has not
been resolved whether these transcripts are in fact produced by this gene or by a closely
related pseudogene, CYP2B7. Both the gene and the pseudogene are located in the
middle of a CYP2A pseudogene found in a large cluster of cytochrome P450 genes from
the CYP2A, CYP2B and CYP2F subfamilies on chromosome 19q. CYP2B6 is thought to
mediate the N-demethylation of (R)- and (S)-ketamine in human liver.

CYP2C8 (same as CPC8, P450 MP-12/MP-20), encoding cytochrome
P450, subfamily IIC (mephenytoin 4-hydroxylase), polypeptide 8, is also a useful target for
the coisogenic eukaryotic cell collections of the present invention. This protein localizes to
the endoplasmic reticulum and its expression is induced by phenobarbital. The enzyme is
known to metabolize many xenobiotics, including the anticonvulsive drug mephenytoin,
benzo(a)pyrene, 7-ethoxycoumarin, and the anti-cancer drug paclitaxel (Taxol®). CYP2C8
also metabolizes cerivastatin, which is a high potency, third generation synthetic statin with
proven lipid-lowering efficacy.

Two transcript variants for this gene have been described; it is thought that
the longer form does not encode an active cytochrome P450 since its protein product lacks
the heme binding site. This gene is located within a cluster of cytochrome P450 genes on
chromosome 10q24.

Another useful target for the coisogenic cell collections of the present
invention is CYP2C9 (cytochrome P450, subfamily IIC (mephenytoin 4-hydroxylase),
polypeptide 9), whose expression is induced by rifampin, and which is known to metabolize
many xenobiotics, including phenytoin, tolbutamide, ibuprofen, aspirin and S-warfarin. See,
e.g., Bigler et al., "CYP2C9 and UGT1A6 genotypes modulate the protective effect of
aspirin on colon adenoma risk," Cancer Res. 61(9):3566-9 (2001).

Studies identifying individuals who are poor metabolizers of phenytoin and
tolbutamide suggest that this gene is polymorphic. The gene is located within a cluster of
cytochrome P450 genes on chromosome 10q24.

CYP11A (same as P450SCC, cytochrome P450C11A1), also usefully
targeted in the coisogenic cell collections of the present invention, encodes a member of the
cytochrome P450 superfamily of enzymes. This protein localizes to the mitochondrial inner
membrane and catalyzes the conversion of cholesterol to pregnenolone, the first and
rate-limiting step in the synthesis of the steroid hormones. The human homologue is
located at 15q23-q24.

CYP2C19 (same as CPCJ, CYP2C, P450C2C, P450IIC19, microsomal
monooxygenase, xenobiotic monooxygenase, mephenytoin 4'-hydroxylase,
flavoprotein-linked monooxygenase), encodes cytochrome P450, subfamily IIC
(mephenytoin 4-hydroxylase), polypeptide 19. This protein localizes to the endoplasmic
reticulum and is known to metabolize many xenobiotics, including the anticonvulsive drug
mephenytoin, omeprazole, diazepam, proguanil, and some barbiturates. The enzyme is
also responsible for the polymorphic (NAT2*) acetylation of hydrazine and aromatic amine
drugs, such as isoniazid, hydralazine, and sulfasalazine. Polymorphism within this gene is
associated with variable ability to metabolize mephenytoin, known respectively as the poor
metabolizer phenotype and extensive metabolizer phenotype. The gene is located within a
cluster of cytochrome P450 genes on chromosome 10q24, at 10q24.1-q24.3.

Other cytochrome P450 enzymes that can usefully be targeted to create
the coisogenic cell collections of the present invention include CYP2F1, CYP2J2, CYP3A5,
CYP3A7 (catalyzes the prenatal 4-hydroxylation of retinoic acid, playing an important role in
protecting the human fetus against retinoic acid-induced embryotoxicity, Chen et al.,
"Catalysis of the 4-hydroxylation of retinoic acids by cyp3a7 in human fetal hepatic tissues,"
Drug Metab. Dispos. 28(9):1051-7 (2000)), CYP4B1, CYP4F2 (found to catalyze
hydroxylation and dealkylation of an H(1)-antihistamine prodrug, ebastine, Hashizume et al.,
"A novel cytochrome p450 enzyme responsible for the metabolism of ebastine in monkey
small intestine," Drug Metab. Dispos. 29(6):798-805 (2001)), CYP4F3, CYP6D1, CYP6F1
(related to CYP6D1 and involved in pyrethroid detoxification in insects), CYP7A1, CYP8,
CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, CYP21A2, CYP24, CYP27A1, and
CYP51.

Other loci that affect drug resistance are also useful targets for
oligonucleotide-mediated alterations for creating eukaryotic coisogenic cell collections of the
present invention.

Among such non-P450 loci are the genes encoding ATP-binding cassette
(ABC) proteins, which transport various molecules across extra- and intra-cellular
membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP,
MRP, ALD, OABP, GCN20, White); some members are well known to confer a multi-drug
(multiple drug) resistance phenotype on tumor cells.

Best known among the ABC proteins is ABCB1 (ATP-binding cassette,
sub-family B (MDR/TAP), member 1), known alternatively as MDR1 (multi drug resistance
1), P-GP (P-glycoprotein), PGY1, ABC20, and GP170, the human homologue of which
maps to 7q21.1.

The protein encoded by this gene is an ATP-dependent drug efflux pump
for xenobiotic compounds with broad substrate specificity. It is responsible for decreased
drug accumulation in multidrug-resistant cells and often mediates the development of
resistance to anticancer drugs. A number of studies have demonstrated a negative
correlation between Pgp expression levels and chemosensitivity or survival in a range of
human malignancies. Lehne, "P-glycoprotein as a drug target in the treatment of multidrug
resistant cancer," Curr. Drug Targets 1(1):85-99 (2000).

P-glycoprotein is also expressed in normal tissues with excretory function
such as liver, kidney and intestine. Apical expression of P-glycoprotein in such tissues
results in reduced drug absorption from the gastrointestinal tract and enhanced drug
elimination into bile and urine. Moreover, expression of P-glycoprotein in the endothelial
cells of the blood-brain barrier prevents entry of certain drugs into the central nervous
system. Human P-glycoprotein has been shown to transport a wide range of structurally
unrelated drugs such as digoxin, quinidine, cyclosporin and HIV-1 protease inhibitors.
Studies in humans indicate a particular importance of intestinal P-glycoprotein for
bioavailability of the immunosuppressant cyclosporin. Moreover, induction of intestinal
P-glycoprotein by rifampin has now been identified as the major underlying mechanism of
reduced digoxin plasma concentrations during concomitant rifampin therapy. For reviews,
see Fromm, "P-glycoprotein: a defense mechanism limiting oral bioavailability and CNS
accumulation of drugs," Int. J. Clin. Pharmacol. Ther. 38(2):69-74 (2000); Schinkel,
"P-Glycoprotein, a gatekeeper in the blood-brain barrier," Adv. Drug Deliv. Rev.
36(2-3):179-194 (1999); Van Asperen et al., "The pharmacological role of P-glycoprotein in
the intestinal epithelium," Pharmacol. Res. 37(6):429-35 (1998); Tanigawara, "Role of
P-glycoprotein in drug disposition," Ther. Drug Monit. 22(1):137-40 (2000); and Schinkel,
"The physiological function of drug-transporting P-glycoproteins," Semin. Cancer Biol.
8(3):161-70 (1997).

Allelic variants of ABCB1 (MDR1) are known to affect its selectivity and/or
activity. Hoffmeyer et al., "Functional polymorphisms of the human multidrug-resistance
gene: multiple sequence variations and correlation of one allele with P-glycoprotein
expression and activity in vivo," Proc. Natl. Acad. Sci. USA 97(7):3473-8 (2000); Choi et al.,
"An altered pattern of cross-resistance in multidrug-resistant human cells results from
spontaneous mutations in the mdr1 (P-glycoprotein) gene," Cell 53(4):519-29 (1988).

ABCB4 (ATP-binding cassette, sub-family B (MDR/TAP), member 4) (also
known as MDR3, PGY3, ABC21, MDR2/3, PFIC-3) (human homologue maps to 7q21.1), is
another useful target locus for the coisogenic cell collections of the present invention.

The membrane-associated protein encoded by this gene is a member of
the superfamily of ATP-binding cassette (ABC) transporters. ABCB4 is a member of the
MDR/TAP subfamily. Members of the MDR/TAP subfamily are involved in multidrug
resistance as well as antigen presentation. This gene encodes a full transporter and
member of the p-glycoprotein family of membrane proteins with phosphatidylcholine as its
substrate.

ABCC1 - ATP-binding cassette, sub-family C (CFTR/MRP), member 1 -
(same as MRP, ABCC, GS-X, MRP1, ABC29) is a member of the MRP subfamily of ATP-
binding cassette (ABC) proteins, and is involved in multi-drug resistance. This protein
functions as a multispecific organic anion transporter, with oxidized glutathione, cysteinyl
leukotrienes, and activated aflatoxin B1 as known substrates. This protein also transports
glucuronides and sulfate conjugates of steroid hormones and bile salts. Alternative splicing
by exon deletion results in several splice variants but maintains the original open reading
frame in all forms.

ABCC2 (same as DJS, MRP2, cMRP, ABC30, CMOAT, Canalicular
multispecific organic anion transporter) encodes ATP-binding cassette, sub-family C
(CFTR/MRP), member 2, and is a useful target locus for the coisogenic cell collections of
the present invention. ABCC2 is a member of the MRP subfamily of ATP binding cassette
proteins, and is involved in multi-drug resistance. This protein is expressed in the
canalicular (apical) part of the hepatocyte and functions in biliary transport. Known
substrates include anticancer drugs such as vinblastine.

Another ATP binding cassette protein usefully targeted in the coisogenic
cell collections of the present invention is ABCC3 (also known as MLP2, MRP3, ABC31,
CMOAT2, MOAT-D, EST90757), the human homologue of which is located at 17q22. The
protein may play a role in the transport of biliary and intestinal excretion of organic anions.
Alternative splicing of this gene results in three known transcript variants.

Also a useful target for the coisogenic cell collections of the present
invention is ATP-binding cassette, sub-family C (CFTR/MRP), member 4, ABCC4, also
known as MRP4, MOATB, MOAT-B, EST170205. The protein encoded by this gene is a
member of the MRP subfamily of ABC transporters, and is involved in multi-drug resistance.
The protein may play a role in cellular detoxification as a pump for its substrate, organic
anions.

Other ABC transporter proteins that can usefully serve as the target
locus for the coisogenic cell collections of the present invention include ABCC4 (MRP4),
ABCC5 (MRP5) (provides resistance to thiopurine anticancer drugs, such as
6-mercaptopurine and thioguanine, and the anti-HIV drug
9-(2-phosphonylmethoxyethyl)adenine; this protein may be involved in resistance to
thiopurines in acute lymphoblastic leukemia and antiretroviral nucleoside analogs in
HIV-infected patients); ABCC6 (MRP6), MRP7 (CFTR), ABCC8 (MRP8), ABCC9, ABCC10,
ABCC11 (same as HI, SUR, MRP8, PHHI, SUR1, ABC36, HRINS), and ABCC12 (same as
MRP9).

Other useful targets include EPHX1 (epoxide hydrolase 1, microsomal
xenobiotic), EPHX2 (epoxide hydrolase 2), LTA4H (leukotriene A4 hydrolase), TRAG3
(Taxol® resistance associated gene 3, which is overexpressed in most melanoma cells and
confers resistance to paclitaxel, Taxol®), GUSB (beta-glucuronidase), TMPT (thiopurine
methyltransferase), BCRP (breast cancer resistance protein, an ATP transporter),
dihydropyrimidine dehydrogenase, HERG (involved in drug transport through potassium ion
channels), hKCNE2 (involved in drug transport through potassium ion channels), UDP
glucuronosyl transferase (UGT) (a hepatic metabolizing enzyme, a detoxifying enzyme for
most carcinogens after different cytochrome P450 (CYP) isoforms), sulfotransferase,
sulfatase, and glutathione S-transferase (GST) -alpha, -mu, -pi (which detoxify therapeutic
drugs, not least several anti-cancer drugs), ACE (peptidyl-dipeptidase A), and KCHN2
(potassium voltage-gated channel, subfamily H (eag-related), member 2; location
7q35-q36).

Another protein usefully targeted in the coisogenic cell collections of the
present invention is the BCR-ABL fusion responsible for chronic myeloid leukemia. The
tyrosine kinase domain of the fusion protein is targeted by imatinib (Gleevec); allelic
variants have been identified that confer polyclonal resistance to the drug. Shah et al.,
Cancer Cell 2:117-125 (2002), incorporated herein by reference in its entirety.

Another protein usefully targeted in the coisogenic cell collections of the
present invention is beta tubulin. Paclitaxel is a tubulin-disrupting agent that binds
preferentially to beta-tubulin. Allelic variants of beta tubulin have been identified that confer
resistance to paclitaxel. Giannakakou et al., J. Biol. Chem. 272:17118-17125 (1997),
incorporated herein by reference in its entirety.

As noted above, the coisogenic cell collections of the present invention can usefully include
cells that have, at the coisogenic target locus, the sequence of a naturally-occurring allele;
this permits the phenotype conferred by the allele to be assessed without the confounding
presence of other genetic differences at the target locus or elsewhere in the cellular
genome. Accordingly, the coisogenic cell collections of the present invention can usefully
include cells that have the naturally occurring (allelic) variants set forth in the following
tables.
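By way of illustration only, the relationship between a "DNA Variant" entry given as a codon change and the corresponding "Protein Variant" entry can be checked against the standard genetic code. The following Python sketch is not part of the disclosed methods; the function name `protein_variant` and its arguments are hypothetical, and the sketch assumes that the codon change is written in the "GGA>GTA" style and that the residue number is supplied separately.

```python
# Standard genetic code, indexed in the conventional T, C, A, G base order.
BASES = "TCAG"
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: RESIDUES[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names; "*" (stop) is written "Ter".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def protein_variant(codon_change: str, residue_number: int) -> str:
    """Translate a codon change such as 'GGA>GTA' into e.g. 'Gly185Val'."""
    reference, variant = (c.strip().upper() for c in codon_change.split(">"))
    if len(reference) != 3 or len(variant) != 3:
        raise ValueError("expected a codon change such as 'GGA>GTA'")
    ref_aa = CODON_TABLE[reference]
    var_aa = CODON_TABLE[variant]
    return f"{THREE_LETTER[ref_aa]}{residue_number}{THREE_LETTER[var_aa]}"


print(protein_variant("GGA>GTA", 185))   # Gly185Val
print(protein_variant("ATC>ATT", 1145))  # Ile1145Ile (silent, "wobble" change)
```

A silent change such as ATC>ATT returns the same residue on both sides, which corresponds to the "(wobble)" annotation used in the tables.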
Table 1

Gene (Synonyms): ABCB1, "ATP-binding cassette, sub-family B (MDR/TAP), member 1"
    (MDR1, P-GP, PGY1, ABC20, GP170, P-Glycoprotein)
Locus: 7q21.1
Accession #s: X58723 and X59732; AC002457, AC005068 (g); AF016535, M14758 (m); NP_000918 (p)
mRNA / Protein: 4643 bp / 1279 aa
Structural information:

Allelic Variants from Scientific Literature

DNA Variant | Protein Variant | Phenotype | References
GGA>GTA | Gly185Val | Correlated with increased colchicine resistance | OMIM 171050, Safa et al. (1990), Choi et al. (1988)
G2995A | Ala999Thr | Unknown | Mickley et al. (1998)
GCT>TCT, G2677T | Ala893Ser | "correlations of mutations with expression levels" | Tanabe et al. (2001), Cascorbi et al. (2001)
GCT>ACT, G2677A | Ala893Thr | "correlations of mutations with expression levels" | Tanabe et al. (2001), Cascorbi et al. (2001)
AAT>GAT, A61G | Asn21Asp | Unknown | Cascorbi et al. (2001), Hoffmeyer et al. (2000), WO 01/09183
AGT>AAT, G1199A | Ser400Asn | "may correlate with low expression" WO 01/09183 (p40) | Cascorbi et al. (2001), Hoffmeyer et al. (2000), WO 01/09183
CAG>CCG, A3320C | Gln1107Pro | Unknown | Cascorbi et al. (2001)
CAG>???, A3320? | Phe103Ser | Unknown | WO 01/09183 (p7)
TTC>CTC, T307C | Phe103Leu | Unknown | Hoffmeyer et al. (2000), WO 01/09183
ATC>ATT, C3435T | Ile1145Ile (wobble) | Correlated with (2X) lower p-glycoprotein expression and activity | OMIM 171050, Hoffmeyer et al. (2000)

Allelic Variants from SNP Database

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_017168 | 4730224 | rs2235039 | XP_029059 | g / a | V / M | 1 | 801
NT_017168 | 4735268 | rs2032581 | XP_029059 | a / g | I / V | 1 | 829



Table 2

Gene (Synonyms): ABCB4, "ATP-binding cassette, sub-family B (MDR/TAP), member 4"
    (MDR3, PGY3, ABC21, MDR2/3, PFIC-3, P-glycoprotein 3)
Locus: 7q21.1
Accession #s: NT_017168 (working draft chromo7); M23234, Z35284 (m)
mRNA / Protein: 5764, 5785, and 5623 bp / 1279, 1286, and 1232 aa
Structural information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
CGA>TGA | Arg957Ter | Cholestasis | OMIM 171060

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_017168 | 4860286 | rs31655 | XP_004599 | g / a | A / T | 1 | 1107
Table 3

Gene (Synonyms): ABCC1, "ATP-binding cassette, sub-family C (CFTR/MRP), member 1"
    (MRP, ABCC, GS-X, MRP1, ABC29)
Locus: 16p13.1
Accession #s: AF022824 (exon2); AF022826 (exon4); AF022827 (exon5); AF022828 (exon6);
    AF022829 (exon7); AF022830 (exon8); AF022831 (exon9); AF022832 (exon10);
    AF017145 (5' flanking sequence); L05628, U91318 (m); NP_004987 isoform 1 (p);
    NP_063915 isoform 2 (p); NP_063953 isoform 3 (p)
mRNA / Protein: 5927, 5749, and 5759 bp / 1531, 1472, and 1475 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
G128C | Cys43Ser | Unknown | Ito et al. (2001)
C218T | Thr73Ile | Unknown | Ito et al. (2001)
G2168A | Arg723Gln | Unknown | Ito et al. (2001)
G3173A | Arg1058Gln | Unknown | Ito et al. (2001)
Table 4

Gene (Synonyms): ABCC2, "ATP-binding cassette, sub-family C (CFTR/MRP), member 2"
    (DJS, MRP2, cMRP, ABC30, CMOAT)
Locus: 10q24
Accession #s: NT_029377 (working draft chromo10); U63970 (m); NP_000383 (p)
mRNA / Protein: 4868 bp / 1545 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
C2302T | Arg768Trp | Dubin-Johnson syndrome | OMIM 601107, Toh et al. (1999), Wada et al. (1998), Ito et al. (2001)
A4145G | Gln1382Arg | Dubin-Johnson syndrome | OMIM 601107, Toh et al. (1999)
G1249A | Val417Ile | Unknown | Ito et al. (2001)
C2366T | Ser789Phe | Unknown | Ito et al. (2001)
G4348A | Ala1450Thr | Unknown | Ito et al. (2001)



WO 03/027264



PCT/US02/31180 






Table 5

Gene (Synonyms): ABCC3, "ATP-binding cassette, sub-family C (CFTR/MRP), member 3"
    (MLP2, MRP3, ABC31, CMOAT2, MOAT-D, EST90757)
Locus: 17q22
Accession #s: NT_010783 (working draft chromo17); AF009670 (m); AF085690 (m);
    AF085691 (m); AF085692 (m); NP_003777 (p); NP_064421 (p); NP_064422 (p)
mRNA / Protein: 5176, 5325, and 5380 bp / 1527, 1238, and 510 aa


Structural Information:

Allelic Variants from SNP Database:

Contig: NT_010783 1619267

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
 |  | rs1051625 | XP_008422 | c / g | L / V | 1 | 120
 |  | rs1003354 | XP_037992 | c / g | T / R | 2 | 527
 |  | rs1003355 | XP_037992 | c / g | A / G | 2 | 528
 |  | rs967935 | XP_037992 | c / t | S / F | 2 | 1221
 |  | rs1003354 | XP_037994 | c / g | T / R | 2 | 527
 |  | rs1003355 | XP_037994 | c / g | A / G | 2 | 528
 |  | rs1051625 | XP_037994 | c / g | L / V | 1 | 1362
 |  | rs1003354 | XP_037997 | c / g | T / R | 2 | 454
 |  | rs1003355 | XP_037997 | c / g | A / G | 2 | 455
 |  | rs1051625 | XP_037997 | c / g | L / V | 1 | 1289
 |  | rs1003354 | XP_037999 | c / g | T / R | 2 | 527
 |  | rs1003355 | XP_037999 | c / g | A / G | 2 | 528
 |  | rs1051625 | XP_037999 | c / g | L / V | 1 | 1362
 |  | rs1003354 | XP_038002 | c / g | T / R | 2 | 527
 |  | rs1003355 | XP_038002 | c / g | A / G | 2 | 528
NT_010783 | 1635643 | rs1051625 | XP_038002 | c / g | L / V | 1 | 1362

Table 6

Gene (Synonyms): ABCC5, "ATP-binding cassette, sub-family C (CFTR/MRP), member 5"
    (MRP5, SMRP, ABC33, MOATC, MOAT-C, pABC11, EST277145)
Locus: 3q27
Accession #s: NT_022676 (working draft chromo3); NP_005679 (m); NP_005679 (p)
mRNA / Protein: 5838 bp / 1437 aa
Structural Information:

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_022676 | 100964 | rs1053351 | XP_002914 | c / a | Y / * | 3 | 1202
NT_022676 | 124876 | rs1053387 | XP_002914 | c / a | T / N | 2 | 1383
NT_022676 | 100964 | rs1053351 | XP_037577 | c / a | Y / * | 3 | 711
NT_022676 | 124876 | rs1053387 | XP_037577 | c / a | T / N | 2 | 892
Table 7

Gene (Synonyms): ABCC6, "ATP-binding cassette, sub-family C (CFTR/MRP), member 6"
    (ARA, PXE, MLP1, MRP6, ABC34, MOATE, EST349066)
Locus: 16p13.1
Accession #s: U91318 (human BAC clone); AF076622 (m); NP_001162 (p)
mRNA / Protein: 4535 bp / 1503 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
C3421T | Arg1141Ter | Pseudoxanthoma Elasticum | OMIM 603234
G3413A | Arg1138Gln | Pseudoxanthoma Elasticum | OMIM 603234
G3341C | Arg1114Pro | Pseudoxanthoma Elasticum | OMIM 603234
C3940T | Arg1314Trp | Pseudoxanthoma Elasticum | OMIM 603234
 | Arg1268Gln | Pseudoxanthoma Elasticum | OMIM 603234
C3412T | Arg1138Trp | Pseudoxanthoma Elasticum | OMIM 603234
C3490T | Arg1164Ter | Pseudoxanthoma Elasticum | OMIM 603234

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_010393 | 2241302 | rs2238472 | XP_007798 | g / a | R / Q | 2 | 1268
NT_010393 | 2241302 | rs2238472 | XP_027249 | g / a | R / Q | 2 | 33
Table 8

Gene (Synonyms): ABCC8, "ATP-binding cassette, sub-family C (CFTR/MRP), member 8"
    (HI, SUR, MRP8, PHHI, SUR1, ABC36, HRINS)
Locus: 11p15.1
Accession #s: L78243 (exon39); U63455 (exon39 and complete cds.);
    NT_009307 (working draft chromo11); AH004854 (m); NP_000343 (p)
mRNA / Protein: 4977 bp / 1581 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
G>T | Gly716Val | Persistent Hyperinsulinemic Hypoglycemia of Infancy | OMIM 600509
G4058C | Arg1353Pro | Persistent Hyperinsulinemic Hypoglycemia of Infancy | OMIM 600509
C4261T | Arg1421Cys | Persistent Hyperinsulinemic Hypoglycemia of Infancy | OMIM 600509
C4480T | Arg1494Trp | Persistent Hyperinsulinemic Hypoglycemia of Infancy | OMIM 600509

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
 |  | rs1048098 | XP_036346 | t / c | F / L | 1 | 157
 |  | rs1048096 | XP_036346 | c / g | L / V | 1 | 167
 |  | rs1048095 | XP_036346 | t / c | L / P | 2 | 225
 |  | rs1048094 | XP_036346 | c / t | A / V | 2 | 256
 |  | rs757110 | XP_036346 | g / t | A / S | 1 | 1369
Table 9

Gene (Synonyms): ACE, "angiotensin I converting enzyme (peptidyl-dipeptidase A) 1"
    (ACE1, DCP1, CD143)
Locus: 17q23
Accession #s: NT_010698 (working draft chromo17); J04144 (m); NP_000780 (p)
mRNA / Protein: 4020 bp / 1306 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
A2350G | ? | "significantly associated with blood pressure" | OMIM 106180

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_010698 | 1458291 | rs4348 | XP_008260 | c / t | P / L | 2 | 5
NT_010698 | 1460620 | rs4976 | XP_008260 | t / c | I / T | 2 | 94
Table 10

Gene (Synonyms): CYP1A1, "cytochrome P450, subfamily I (aromatic compound-inducible),
    polypeptide 1" (AHH, AHRR, CP11, CYP1, P1-450, P450-C, P450DX)
Locus: 15q22-24
Accession #s: X02612; X04300; X02612 (m); X04300 (m); NP_000490 (p)
mRNA / Protein: 2602 bp / 512 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
 | Ala462Val | Correlated with increased risk of lung cancer, but may be just marker | OMIM 108330

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_010374 | 225016 | rs1048943 | XP_007727 | a / g | I / V | 1 | 462
NT_010374 | 225018 | rs1799814 | XP_007727 | c / a | T / N | 2 | 461
NT_010374 | 227193 | rs2229150 | XP_007727 | c / t | R / W | 1 | 93






Table 11

Gene (Synonyms): CYP1B1, "cytochrome P450, subfamily I (dioxin-inducible), polypeptide 1
    (glaucoma 3, primary infantile)" (CP1B, GLC3A)
Locus: 2p21
Accession #s: U56438; X04300 (g); X02612 (g); U66438 (g); U03688 (m); NP_000095 (p)
mRNA / Protein: 5128 bp / 543 aa
Structural information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
G3976A | Trp57Ter | Peters Anomaly | OMIM 601771
T3807C | Met1Thr | Peters Anomaly | OMIM 601771
G1505A | Lys387Glu | Glaucoma | OMIM 601771
G7957A | Asp374Asn | Glaucoma | OMIM 601771
C8242T | Arg469Trp | Glaucoma | OMIM 601771
G3987A | Gly61Glu | Glaucoma | OMIM 601771
? | Gly365Trp | Glaucoma | OMIM 601771

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_005274 | 679631 | rs10012 | XP_002576 | c / g | R / G | 1 | 48
NT_005274 | 679844 | rs1056827 | XP_002576 | g / t | A / S | 1 | 119
NT_005274 | 683818 | rs1056836 | XP_002576 | g / c | V / L | 1 | 432
NT_005274 | 683871 | rs1056837 | XP_002576 | t / a | D / E | 3 | 449
NT_005274 | 683882 | rs1800440 | XP_002576 | a / g | N / S | 2 | 453



Table 12

Gene (Synonyms): CYP2A6, "cytochrome P450, subfamily IIA (phenobarbital-inducible),
    polypeptide 6" (CPA6, CYP2A3)
Locus: 19q13.2
Accession #s: U22027; NM_000762 (m); NP_000753 (p); NG_000008 (g)
mRNA / Protein: 1751 bp / 494 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
? | Leu160His | Protein becomes "catalytically inactive" | OMIM 601771



Table 13

Gene (Synonyms): CYP2A7, "cytochrome P450, subfamily IIA (phenobarbital-inducible),
    polypeptide 7" (CPA7, CPAD, CYPIIA7, P450-IIA4)
Locus: 19q13.2
Accession #s: NT_029481 (working draft chromo19); NG_000008 (g); NM_000764 (m);
    NP_000755 (p); NP_085079 (p)
mRNA / Protein: 2281 bp and 2128 bp / 494 and 443 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
T>A | Leu160His | Unknown | OMIM 122720
Table 14

Gene (Synonyms): CYP2C8, "cytochrome P450, subfamily IIC (mephenytoin 4-hydroxylase),
    polypeptide 8" (CPC8, P450 MP-12/MP-20)
Locus: 10cen-q26.11
Accession #s: L16876 (exon9); NT_008769 (working draft 10); NM_000770 (m);
    NM_030878 (m); NP_000761 (p); NP_110518 (p)
mRNA / Protein: 1851 and 1890 bp / 490 and 393 aa
Structural Information:

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_008769 | 823719 | rs1058930 | XP_011938 | c / g | I / M | 3 | 264
NT_008769 | 823719 | rs1058930 | XP_050924 | c / g | I / M | 3 | 67
NT_008769 | 823719 | rs1058930 | XP_050926 | c / g | I / M | 3 | 251
NT_008769 | 823719 | rs1058930 | XP_050929 | c / g | I / M | 3 | 264
Table 15

Gene (Synonyms): CYP2C9, "cytochrome P450, subfamily IIC (mephenytoin 4-hydroxylase),
    polypeptide 9" (CPC9, CYP2C10, P450IIC9, P450 MP-4, P450 PB-1)
Locus: 10q24
Accession #s: NT_008769 (working draft chromo10); NM_000771 (m); NP_000762 (p)
mRNA / Protein: 1835 bp / 490 aa
Structural information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
? | Arg144Cys | Warfarin Sensitivity | OMIM 601129
? | Ile359Leu | Poor tolbutamide metabolism (diabetes mellitus) | OMIM 601129

Allelic Variants from SNP Database:

Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_008769 | 43400 | rs1057910 | XP_050915 | a / c | I / L | 1 | 21
NT_008769 | 43402 | rs1057909 | XP_050915 | a / g | Y / C | 2 | 20
Table 16

Gene (Synonyms): CYP2C19, "cytochrome P450, subfamily IIC (mephenytoin 4-hydroxylase),
    polypeptide 19" (CPCJ, CYP2C, P450C2C, P450IIC19)
Locus: 10q24.1-q24.3
Accession #s: NT_008769 (working draft chromo10); M61854 (m); NM_000769 (m); NP_000760 (p)
mRNA / Protein: 1473 bp / 490 aa
Structural Information: arg433-to-trp mutation in the heme-binding region, Ibeanu et al. (1998)

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
 | Arg433Trp | Mephenytoin 4-Hydroxylase defect, poor metabolizer | OMIM 124020



Table 17

Gene (Synonyms): CYP2D6, "cytochrome P450, subfamily IID (debrisoquine, sparteine, etc.,
    -metabolizing), polypeptide 6" (CPD6, CYP2D, CYP2D@, P450C2D, P450-DB1)
Locus: 22q13.1
Accession #s: M33388; NM_000106 (m); NP_000097 (p)
mRNA / Protein: 1655 bp / 497 aa
Structural Information:

Allelic Variants from Scientific Literature:

DNA Variant | Protein Variant | Phenotype | References
 | Gly169Ter | Debrisoquine, poor drug metabolizer | OMIM 124030

Allelic Variants from SNP Database:



Contig Accession | Contig Position | dbSNP rs# (Cluster ID) | Protein Accession | dbSNP Allele | Protein Residue | Codon Position | Amino Acid
NT_011520 | 21651359 | rs2103556 | XP_013013 | c / g | T / S | 2 | 396
NT_011520 | 21651463 | rs2070905 | XP_013013 | g / a | M / I | 3 | 361
NT_011520 | 21651686 | rs2070907 | XP_013013 | a / g | K / E | 1 | 320
NT_011520 | 21652249 | rs1065569 | XP_013013 | g / a | V / M | 1 | 284
NT_011520 | 21652275 | rs1974456 | XP_013013 | g / a | R / H | 2 | 275
NT_011520 | 21652631 | rs1800754 | XP_013013 | c / t | S / L | 2 | 221
NT_011520 | 21652662 | rs1058171 | XP_013013 | a / g | N / D | 1 | 211
NT_011520 | 21652664 | rs1058170 | XP_013013 | g / c | G / A | 2 | 210
NT_011520 | 21653063 | rs1058167 | XP_013013 | c / t | P / L | 2 | 141
NT_011520 | 21651359 | rs2103556 | XP_040060 | c / g | T / S | 2 | 140
NT_011520 | 21651463 | rs2070905 | XP_040060 | g / a | M / I | 3 | 105
NT_011520 | 21651686 | rs2070907 | XP_040060 | a / g | K / E | 1 | 64
NT_011520 | 21652249 | rs1065569 | XP_040060 | g / a | V / M | 1 | 28
NT_011520 | 21652275 | rs1974456 | XP_040060 | g / a | R / H | 2 | 19
NT_011520 | 21651359 | rs2103556 | XP_040062 | c / g | T / S | 2 | 140
NT_011520 | 21651463 | rs2070905 | XP_040062 | g / a | M / I | 3 | 105
NT_011520 | 21651686 | rs2070907 | XP_040062 | a / g | K / E | 1 | 64
NT_011520 | 21652249 | rs1065569 | XP_040062 | g / a | V / M | 1 | 28
NT_011520 | 21652275 | rs1974456 | XP_040062 | g / a | R / H | 2 | 19
NT_011520 | 21651359 | rs2103556 | XP_040064 | c / g | T / S | 2 | 180
NT_011520 | 21651463 | rs2070905 | XP_040064 | g / a | M / I | 3 | 145
NT_011520 | 21651686 | rs2070907 | XP_040064 | a / g | K / E | 1 | 104
NT_011520 | 21652249 | rs1065569 | XP_040064 | g / a | V / M | 1 | 68
NT_011520 | 21652275 | rs1974456 | XP_040064 | g / a | R / H | 2 | 59
NT_011520 | 21652631 | rs1800754 | XP_040064 | c / t | S / L | 2 | 5
NT_011520 | 21651359 | rs2103556 | XP_040065 | c / g | T / S | 2 | 227
NT_011520 | 21651463 | rs2070905 | XP_040065 | g / a | M / I | 3 | 192
NT_011520 | 21651686 | rs2070907 | XP_040065 | a / g | K / E | 1 | 151
NT_011520 | 21652249 | rs1065569 | XP_040065 | g / a | V / M | 1 | 115
NT_011520 | 21652275 | rs1974456 | XP_040065 | g / a | R / H | 2 | 106
NT_011520 | 21652631 | rs1800754 | XP_040065 | c / t | S / L | 2 | 52
NT_011520 | 21652662 | rs1058171 | XP_040065 | a / g | N / D | 1 | 42
NT_011520 | 21652664 | rs1058170 | XP_040065 | g / c | G / A | 2 | 41
NT_011520 | 21651359 | rs2103556 | XP_040066 | c / g | T / S | 2 | 396
NT_011520 | 21651463 | rs2070905 | XP_040066 | g / a | M / I | 3 | 361
NT_011520 | 21651686 | rs2070907 | XP_040066 | a / g | K / E | 1 | 320
NT_011520 | 21652249 | rs1065569 | XP_040066 | g / a | V / M | 1 | 284
NT_011520 | 21652275 | rs1974456 | XP_040066 | g / a | R / H | 2 | 275
NT_011520 | 21652631 | rs1800754 | XP_040066 | c / t | S / L | 2 | 221
NT_011520 | 21652662 | rs1058171 | XP_040066 | a / g | N / D | 1 | 211
NT_011520 | 21652664 | rs1058170 | XP_040066 | g / c | G / A | 2 | 210
NT_011520 | 2165305 | rs1058169 | XP_040066 | c / t | H / H | 3 | 142
NT_011520 | 21653063 | rs1058167 | XP_040066 | c / t | P / L | 2 | 141


15 



20 



25 



Allelic variants from Karolinska Institute:

Allele | Protein | Nucleotide changes | Trivial name | Effect | Enzyme activity



CYP2D6*1 CYP2D6.1 None Wild-type 
A 

CYP2D6*2 CYP2D6.2 -1 584CG; - CYP2D6L R296C: 



1235AG; - 
740CT; - 
678GA; 
1661GC; 
2850GT; 
4180GG 



CYP2D6*2 CYP2D6,2 1039CT; 
B 1661GC; 

2850CT; 
4180GC 

CYP2D6*2 CYP2D6.2 1661GC; 
C 2470TC; 

2850CT; 
4180GC 

CYP2D6*2CYP2D6.2 2850CT; 

D 4180GC 
CYP2D6*2CYP2D6.2 997CG; 
E 1661GC; 

2850CT; 
4180GC 

CYP2D6*2 CYP2D6.2 1661GC; 
F 1724CT; 

2850CT; 
4180GC 



S486T 



M10 
M12 

M14 



R296C; 
S486T 



R296C; 
S486T 



R296C; 
S486T 
R296C; 
S486T 



R296C; 
S486T 



in vivo 
Normal 

Normal 
(dx,d,s) 



In vitro 
Nomnal 



Reference 
s 



Kimura et 
al, 1989 
Johansson 
et 31.1993 
Panserat 
etal,1994 
Ralmundo 
et al, 2000 
See also 
comment 
below the 

table. 
Marez et 
al. 1997 



Marez et 
al, 1997 
Sachse et 
al, 1997 
Marez et 
al. 1997 
Marez et 
al. 1997 



Marez et 
al. 1997 
MIR
ivi 10 






• 


ivicii 0^ 


G 


Z47U I o; 




0400 1 






































41oOGC 












CYr2D6 2 CYP2DD.2 


1dd1GC; 






• 




iviarez ei 


H 


248UCT, 




O40D 1 






9I 1QQ7 




2ooOCT, 


























CYP2D6*2 CYP2D6.2 


1661GC; 


M18 


R29dC; 


• 




Marez et 


J 


2850CT; 




Oil OCT" 

S4ooT 






ai hoot 
ai, 199/ 




2939GA; 














A A orvo^ 
41oOGC 












CYP2D6 2 CYP2D6.2 


A CO A • 
I60IGC; 


M21 


R296C; 


• 


• 


iviarez ei 


K 


2850CT; 




S486T 










AAA e/^T*. 

411 5CT; 














4180GC 












CYP2D6*2 CYP2D6.2 


1661GC; 




R296C; 


incr 


• 


jonansson 


XN 


2850CT; 




S486T 


(a) 




ei ai, m^o 


(N=2. 3, 4, 


4180GC 




N active 






Uani et ai, 


5 or 13) 






genes 






I99D 












AKiiiiu ei ai, 
















CYP2D6*3 


2549Aael 


CYP2D6A Frameshift 


None 


None 


rvagimolo 


A 








(a. s) 




et ai, lyyu 


CYP2D6*3 


Af Af\A 

1749AG; 




N166D; 


• 


• 


iviarez ei 


B 


2549Adel 




frameshift 






al, lyy/ 


CYP2D6*4 


100CT; 


CYP2D6B 


P34S; 


None 


None 


tsagimoto 


A 


974CA; 




L91M; 


(a, s) 


(D) 


et ai, lyyu 




9o4AG;_9 




H94R; 






oougn ei 




97CG; 




Splicing 






at, lyyu 




1661GC; 




defect; 






tianioKa et 




1846GA; 




S486T 






ai, nyyu 




4180GC 












CYP2D6 4 


100CT, 


CYP2D6B 


P34S; 


None 


None 


l^oninnfttA 

ixagirnolu 


B 


974CA; 




L91M; 


lA t>\ 

(d, s) 


(D) 


At 41 •iQon 
et ai, iyyu 




984 AG; 




H94R; 










997CG; 




Splicing 










1846GA; 




defect; 










4I8OGC 




S486T 








CYP2D6M 


100CT; 


K29-1 


P34S; 


None 


• 


V/^l^/^tA At 

YOKOia et 


C 


1661GC; 




Splicing 






aii lyyo 




1846GA; 




defect; 










3887TC: 




L421P; 










4180GC 




S486T 








CYP2D6*4 


100CT: 




P34S; 


None (dx) 




Marez et 


D 


1039CT; 




Splicing 






aU997 




1661GC; 




defect; 










1846GA; 




S486T 










4180GC 












CYP2D6M 


100CT; 




P34S; 






Marez et 


E 


1661GC: 




Splicing 






al, 1997 




1846GA; 




defect; 










4180GC 




S486T 
CYP2D6*4F

CYP2D6*4G

CYP2D6*4H

CYP2D6*4J

CYP2D6*4K

CYP2D6*4L

CYP2D6*4X2

CYP2D6*5

CYP2D6*6A

CYP2D6*6B



luocT; 


P34S; 


874CA; 


L91M; 


no A A 

9o4AG; 


H94R; 


997CG; 


Splicing 


A CCA 

1661GC; 


defect; 


A OAdr^ A . 

1 o4oGA; 


R173C; 


1 oOOU 1 , 


Oil OC¥* 


4180GC 




1 0OCT; 


P34S; 


974CA; 


L91M; 


984AG; 


H94R; 


997CG; 


Splicing 


1661GC; 


defect; 


1846GA; 


P325L; 


^yooCT; 


S486T 


4180GC 




100CT; 


P34S; 


974CA; 


L91IVI; 


yo4AC3, 


H94R; 


997CG; 


Splicing 


1661 GC; 


defect; 


1846GA; 


E418Q; 


3877GC; 


S486T 


A A Qf\r*^ 

41oOGC 




100CT; 


P34S; 


974CA; 


L91M; 


984AG; 


H94R; 


997CG; 


Splicing 


1661 GC; 


defect 


1846GA 




100CT; 


P34S; 


loolGC; 


Splicing 




defect; 


2850CT; 


R296C; 


4180GC 


S486T 


100CT; 


P34S; 


997CG; 


Splicing 


1661GC; 


defect; 


1846GA; 


S486T 


4180GC 

• • 




GYP2D6 CYP2D6D 


CYP2D6 


deleted 


deleted 



Marez et 
al. 1997 



None 



None 



None 
(d,s) 



1707Tdel CYP2D6T Frameshift None 

(d.dx) 

1707TdeI; Frameshift; None 

1976GA G212E (s, d) 



Marez et 
al, 1997 



IVIarez et 
al. 1997 



Marez et 
al.1997 



Saciise et 
al, 1997 



Submitted 
17-Aug-OO 

by Dr. T. 

Shimada 

L0vlie et 
al. 1997 
Sachse et 
al. 1998 
Gaedigk et 
al, 1991 
Steen et al, 

1995 
Saxena et 
al. 1994 
Evert et al, 

1994 
Daly et al, 
1995 



25 



CYP2D6*6 
C 

CYP2D6*6 
D 



1707TcleI; 
1976GA; 
4180GC 
1707TdeI; 
3288GA 



Frameshift; None (s) 

G212E; 

S486T 
Frameshift; 

G373S 



GYP2D6*7CYP2D6.7 2935AC CYP2D6E H324P 

CYP2D6*8 . 1661 GC; CYP2D6G Stop 
1758GT; codon; 
2850CT; R296C; 
4180GC S486T 

CYP2D6*9 CYP2D6.9 2613- CYP2D6G K281del 
2615delAG 
A 



CYP2D6*1 CYP2D6.1 
OA 0 

10 CYP2D6*1 CYP2D6.1 
OB 0 



CYP2D6*1 
OC 



100CT; 
1661GC; 
4180GC 

100GT; 
1039CT; 
1661GC; 
4180GC 



CYP2D6J 



CYP2D6C 
h1 



P34S; 
S486T 

P34S; 
S486T 



None 

(s) 
None 
(d.s) 



Deer 
(b,s,d) 



Deer 
(s) 

Deer 



Marez et 
al, 1997 

Marez et 
al, 1997 
Evert et al, 

1994 
Broly et al. 

1995 



Deer Tyndale et 
(b.s.d) al, 1991 
Broly & 
Meyer, 
1993 
Yokota et 
al, 1993 

Deer Johansson 
(b) etal,1994 



see CYP2D6*36 



CYP2D6*1 




883GC; 


CYP2D6F 


Splicing 


None 


Marez et 


1 




1661GC; 




defect; 


(s) 


al. 1995 






2850CT; 




R296C; 










4180GC 




S486T 






CYP2D6*1 CYP2D6.1 


124GA; 




G42R;; 


None 


Marez et 


2 


2 


1661GC: 




R296C; 


(s) 


al, 1996 






2850CT; 




S486T 










4180GC 










CYP2D6*1 




CYP2D7P/ 




Frameshift 


None 


Panserat 


3 




CYP2D6 






(dx) 


et al, 1995 






hybrid. 














Exon 1 














CYP2D7. 














exons 2-9 














CYP2D6. 










CYP2D6*1 CYP2D6.1 


100CT; 




P34S; 


None 


Wang, 


4 


4 


1758GA; 




G169R; 


(d) 


1992 






2850CT: 




R298C; 




Wang et al, 






4180GC 




S486T 




1999 


CYP2D6*1 




138insT 




Frameshift 


None 


Sachse et 


5 










(d, dx) 


al, 1996 


CYP2D6*1 




CYP2D7P/ CYP2D6D Frameshift 


None 


Daly et al, 


6 




CYP2D6 


2 




(d) 


1996 



hybrid. 
Exons1-7 
CYP2D7P- 

related, 
exons 8-9 
CYP2D6. 
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10 



15 



20 



30 



CYP2D6*1 CYP2D6.1 


1023CT; 




11 071; 


7 7 


1638GC: 




K29DLr; 




2850CT: 




Oil OCT 




4180GC 






CYP2D6*1 CYP2D6.1 


4125- 


CYP2Dd(J 


468- 


8 8 


4133insGT 




470VPT 


GCCCACT 


ins 


CYP2D6*1 


1661GC; 


• 


Frameshift; 


9 


2539- 




K29oC; 




2542delAA 




S486T 




CT; 








2850CT; 








4180GC 






CYP2D6*2 


1661GC; 


• 


Frameshift 


0 


1973lnsG; 




; L213S; 




1978CT; 




R296C; 




1979TC; 




S486T 




2850CT; 








4180GC 






CYP2D6*2 CYP2D6.2 
1 1 


77GA 


Ml 


R26H 


CYP2D6*2 CYP2D6.2 


82CT 


M2 


R28C 


2 2 








CYP2D0 2 CYP2D6,2 


957CT 


M3 


A85V 


3 3 








CYP2D6*2 CYP2D6.2 


2853AC 


M6 


I297L 


4 4 








CYP2D6 2 CYP2D6.2 


3198GG 


M7 


R343G 


5 5 








CYP2D6 2 CYP2D6.2 


3277TC 


Ik ilO 

MS 


I369T 










CYP2D6*2CYP2D6.2 


3853GA 


MS 


c410K 


7 7 








CYP2D6*2CYP2D6.2 


19GA: 


Mil 


V7M; 


8 8 


1661GG; 




Q151E; 




1704CG; 




R296C; 




2850CT: 




S48oT 




4180GC 






CYP2D6*2 CYP2D6.2 


1659GA; 


M13 


V136M; 


9 9 


1661GC; 




R296C; 




2850CT; 




V338M; 




3183GA; 




S486T 




4180GC 






CYP2D6*3CYP2D6.3 


1661GC; 


M15 


172- 


0 0 


1863 ins 




174FRP 




9bp rep; 




rep; 




2850CT: 




R296C; 




4180GC 




S486T 


CYP2D6*3 CYP2D6.3 


1661GC: 


M20 


R296C; 


1 1 


2850CT; 




R440H; 




4042GA; 




S486T 




4180GC 







Deer 


Deer 


Masimirem 


(d) 


(b) 


bwa et al, 






1996 






Osearson 






etal, 1997 


None (s) 


Deer (b) 


Yokoj et al, 


1996 


None 




Marez et 






a!, 1997 



Marez- 
Allorge et 
al, 1999 



Marez et 
al. 1997 
Marez et 
al, 1997 
Marez et 
al, 1997 
Marez et 
al. 1997 
Marez et 
al, 1997 
Marez et 
al, 1997 
Marez et 
al. 1997 
Marez et 
al, 1997 



Marez et 
a!, 1997 



Marez et 
al, 1997 



Marez et 
al, 1997 
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CYP2D6*3 CYP2D6 
2 2 



CYP2D6*3 CYP2D6, 

3 3 
5 CYP2D6*3CYP2D6. 

4 4 
CYP2D6*3 CYP2D6. 

5 5 



CYP2D6*3 CYP2D6 
10 5X2 5 



CYP2D6*3 CYP2D6 
6 6 



CYP2D6*3CYP2D6 
7 7 



15 CYP2D6*3 
8 



CYP2D6*3 CYP2D6 
9 '9 



3 1661GC; M19 

2850CT; 

3853GA; 

4180GC 
3 2483GT CYP2D6*1 
C 

3 2850CT CYP2D6M 
D 

,3 31 GA; CYP2D6*2 
1661GC; B 
2850CT; 
4180GC 
.3 31 GA; 
1661GC; 
2850CT; 
4180GC 
.3 100CT; CYP2D6C 
1039CT; h2 
1661GC; 
4180GC; 

gene 
conversion 
to CYP2D7 
In exon 9 
.3 10GCT; CYP2D6M 
1039CT; CD 
1661 GC; 
1943GA: 
4180GC; 

2587- N2 
2590delGA 
CT 
3 1661GC; 
4180GC 



20 



CYP2D6M CYP2D6.4 ld23CT; 
0 0 1661GC; 

1863 
ins(TTT 
CGC 
CCC)2; 
2850 CT; 
4180GC 

CYP2D6M CYP2D6.2 -1235AG; 
^ 1 -740CT;- 
678GA; 
1661GG; 
2850CT; 
4180GC 



R296C; 
E410K; 
S486T 

A237S Nonnal (s) 
R296C 

V11M; Nonnal (s) 

R296C; 

S486T 

VII M; Incr 

R296C; 

S486T 

P34S; Deer Deer 
S486T (d) (b) 



P34S; 
R201H; 
S486T 



Frameshift None 
S486T 



T107I; None(dx) 
172- 
174(FRP)3 
; R296C; 

S486T 



R296C; Deer (s) 
S486T 



b, bufuralol; d, debrisoquine; dx, dextromethorphan; s, sparteine 
SwissProtGenBankOMIMGeneCards 



Marez et 
al, 1997 



Marez et 
al, 1997 
Marez et 
al, 1997 
Marez et 
al, 1997 



Griese et 
al, 1998 



Wang, 
1992 
Johansson 
etal, 1994 
Leathart et 

aM998 



Marez et 
al, 1997 



Leathart et 
aM998 

Submitted 
17-Aug-OO 

by Dr. T. 

Shimada 
Submitted 
28-Feb-01 

by Dr. A. 

Gaedigk 



Raimundo 
et al, 2000 
This allele 
is being 
further 
charaeteris 
ed. 



Note: The -1584CG, -1235AG, -740CT, and -678GA polymorphisms are probably found in most alleles of the CYP2D6*2 series (Raimundo et al., 2000).
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Table 18 



Gene (Synonyms): CYP4A11, "cytochrome P450, subfamily IVA, polypeptide 11" (CP4Y, CYP4A2, CYP4AII)
Accession #s: NT_029224 (working draft chromo1); NM_000778 (m); NP_000769 (p)
mRNA / Protein: 2815 bp / 519 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_029224   405284    rs2056899     XP_037166  a       N        1         48
                                               t       Y
NT_029224   406350    rs2056900     XP_037166  g       G        1         26
                                               a       S
Table 19

Gene (Synonyms): CYP4F2, "cytochrome P450, subfamily IVF, polypeptide 2" (CPF2)
Locus: 19pter-p13.11
Accession #s: U02388 (m); NM_001082 (m); NP_001073 (p); NT_011281 (working draft chromo19); NT_025130 (working draft chromo19)
mRNA / Protein: 2360 bp / 520 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_011281   77228     rs2108622     XP_051256  g       V        1         433
                                               a       M



Table 20 



Gene (Synonyms): CYP11A, "cytochrome P450, subfamily XIA (cholesterol side chain cleavage)" (P450SCC, cytochrome P450C11A1)
Locus: 16q23-q24
Accession #s: D00169; NT_010298 (g); M14565 (m); NM_000781 (m); NP_000772 (p)
mRNA / Protein: 1821 bp / 521 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_010298   262118    rs1130841     XP_007646  g       C        2         16
                                               a       Y
NT_010298   281261    rs1049968     XP_007646  c       I        3         301
                                               g       M
NT_010298   281298    rs6161        XP_007646  g       E        1         314
                                               a       K
NT_010298   281298    rs6161        XP_027406  g       E        1         4
                                               a       K
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Table 21 



10 



Gene 
(Synonyms) 
CYP11B1 

"cytochrome 
P450. 

subfamily XIB 
(steroid 11- 
beta- 

hydroxylase), 
polypeptide 1" 
(FHI, CPN1. 
CYP11B. 
P450C11) 



15 



DNA Variant 
? 

? 

? 



20 



mRNA/ 
Protein 

2092 bp 
503 aa 



Locus Accession #s 

8q21 D10169, D90428, 

X55765 (exoni and 
5' flanldng region) 
D16153 (exon 1 and 
2 normal) 
M32863, J05140 
(axon 1 and 2) 
M32878 (exon 3-8) 
D16154 (exon 3-9) 
M32879 (exon 9) 
NT_008127 
(working draft 
chromed) 

NT_008127 (g) 
X55764 (m) 
NM_000497 (m) 
NP_000488 (p) 



Allelic Variants from Scientific Literature: 



Structural 
Information 



Protein 
Variant 
Pro42Ser 



Thr319Me 
t 

Asn133Hi 
s 

Arg374GI 
n 

Thr318Me 
t 



CGC > CAC Arg448Hls 



Phenotype 

Steroid 11-Beta- 
hydroxylase 
deficiency 
Steroid 11 -Beta- 
hydroxylase 
deficiency 
Steroid 11-Beta- 
hydroxylase 
deficiency 
Steroid 11-Beta- 
hydroxylase 
deficiency 
Steroid 11-Beta- 
hydroxyiase 
deficiency 
Steroid 11-Beta- 
hydroxylase 
deficiency 



References 
OIVIIM 202010 

OMIM 202010 

OMIM 202010 

OIVIIM 202010 

OMIM 202010 

OMIM 202010 



Allelic Variants from SNP Database: 



25 



Contig 
Accession 
NT 008127 



Contig 
Position 
147509 



dbSNPrs# 
(Cluster ID) 
rs5294 



NT_008127 147747 rs4541 



Protein 
Accession 
XP_030748 

XP 030748 



dbSNP Protein Codon 
Allele Residue Position 

t Y 1 

c H 

c A 2 

t V 



Amino 

Acid 
439 

386 
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10 



20 



NT_008127 


147756 


rs5312 


XP_030748 


a 
t 


E 
V 


2 


383 


NT_008127 


148261 


rs6407 


XP_030748 


g 

a 
c 

g 
g 


A 

T 


1 


348 


NT_008127 


148788 


rs5292 


XP_030748 


L 

V 


1 


293 


NT_008127 


148823 


rs5291 


XP_030748 


S 

N 

F 
1 


2 


281 


NT_008127 


149180 


rs5288 


XP_030748 


a 
t 


3 


257 


NT_008127 


149208 


re4547 


XP_030748 


9 
c 
t 
a 
c 


L. 

T 
1 


2 


248 


NT_008127 


149286 


rs5308 


XP_030748 


N 
T 


2 


222 


NT_008127 


149608 


rs5287 


XP_030748 


9 
1 


M 


3 


160 


NT_008127 


152097 


rs5282 


c 

XP_030748 


9 
c 


D 
H 
R 
Q 


1 


63 


NT_008127 


152156 


rs4534 


XP_030748 


g 

a 


2 


43 


NT_008127 


152255 


rs6405 


XP_030748 


g 

a 


C 
Y 


2 


10 



Table 22 



Gene (Synonyms): CYP11B2, "cytochrome P450, subfamily XIB (steroid 11-beta-hydroxylase), polypeptide 2" (CPN2, CYP11B, CYP11BL, P-450C18, P450aldo)
Locus: 8q21-q22
Accession #s: D13752; X54741 (m); NM_000498 (m); NP_000489 (p)
mRNA / Protein: 2936 bp / 503 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype                                  References
?            Lys173Arg        Low renin, susceptibility to hypertension  OMIM 124080
?            Glu198Asp        Congenital hypoaldosteronism               OMIM 124080
?            Thr185Ile        Congenital hypoaldosteronism               OMIM 124080
?            Leu461Pro        Congenital hypoaldosteronism               OMIM 124080
GTG>GCG      Val386Ala        Congenital hypoaldosteronism               OMIM 124080
CGG>TGG      Arg181Trp        Congenital hypoaldosteronism               OMIM 124080



Table 23 



Gene (Synonyms): CYP17, "cytochrome P450, subfamily XVII (steroid 17-alpha-hydroxylase), adrenal hyperplasia" (CPT7, S17AH, P450C17)
Locus: 10q24.3
Accession #s: M19489; NT_029393 (g); M14564 (m); NM_000102 (m); NP_000093 (p)
mRNA / Protein: 1755 bp / 508 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype                                 References
T>G          PHE417CYS        Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
G>A          ARG358GLN        Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
G>A          ARG347HIS        Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
CGG>TGG      ARG96TRP         Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
CCA>ACA      PRO342THR        Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
CGA>TGA      ARG239TER        Alpha-hydroxylase/17,20-lyase deficiency  OMIM 202110
             SER106PRO        Adrenal hyperplasia                       OMIM 202110

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_029393   754865    rs762563      XP_005915  c       C        3         22
                                               g       W



Table 24 



Gene (Synonyms): CYP19, "cytochrome P450, subfamily XIX (aromatization of androgens)" (ARO, ARO1, CPV1, CYAR, P-450AROM)
Locus: 15q21.1
Accession #s: L21982 (gene, untranslated exon 1.4); NT_010204 (working draft chromo15); NM_000103 (m); NM_031226 (m); NP_000094 (p); NP_112503 (p)
mRNA / Protein: 3007 and 3116 bp / 503 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype             References
C1303T       Arg435Cys        Aromatase deficiency  OMIM 107910
G1310A       Cys437Tyr        Aromatase deficiency  OMIM 107910
C1123T       Arg375Cys        Aromatase deficiency  OMIM 107910
G1094A       Arg365Gln        Aromatase deficiency  OMIM 107910

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_010204   1691214   rs2236722     XP_035593  t       W        1         39
                                               c       R
NT_010204   1706104   rs1803154     XP_035593  a       K        1         108
                                               t       *
NT_010204   1718241   rs700519      XP_036593  c       R        1         264
                                               t       C
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Table 25 



Gene (Synonyms): CYP21A2, "cytochrome P450, subfamily XXIA (steroid 21-hydroxylase, congenital adrenal hyperplasia), polypeptide 2" (CPS1, CA21H, CYP21, CYP21B, P450C21B)
Locus: 6p21.3
Accession #s: M13936; NG_000013 (g); NT_007592 (g); NM_000500 (m); M26856 (m); NP_000491 (p)
mRNA / Protein: 2112 bp / 495 aa

Allelic Variants from Scientific Literature:

Protein Variant  Phenotype                    References
Gly424Ser        Adrenal Hyperplasia          OMIM 201910
Glu380Asp        Adrenal Hyperplasia          OMIM 201910
Arg339His        Adrenal Hyperplasia          OMIM 201910
Met238Lys        Adrenal Hyperplasia          OMIM 201910
Val236Glu        Adrenal Hyperplasia          OMIM 201910
Ile235Asn        Adrenal Hyperplasia          OMIM 201910
Tyr102Arg        21-Hydroxylase polymorphism  OMIM 201910
Pro453Ser        Adrenal Hyperplasia          OMIM 201910
Gly292Ser        Adrenal Hyperplasia          OMIM 201910
Ser268Thr        21-Hydroxylase polymorphism  OMIM 201910
Pro30Leu         Adrenal Hyperplasia          OMIM 201910
Arg356Trp        Adrenal Hyperplasia          OMIM 201910
Val281Leu        Adrenal Hyperplasia          OMIM 201910
Ile172Asn        Adrenal Hyperplasia          OMIM 201910

Allelic Variants from SNP Database:



Contig Contig dbSNPrs# Protein 
Accession Position (Cluster ID) Accession 



dbSNP Protein Codon 
Allele Residue Position 



Amino 
Add 
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NT_007592 
NT_007592 
NT_007592 
NT_007592 
5 NT_007592 
NT_007592 
NT_007592 
NT_007592 
NT_007592 
10 NT_007592 
NT 007592 





roD*r f o 


XP 004200 


n 


s 


2 


494 








o 
a 


N 






0ZUU900 




Ar U*» ^UU 




p 


1 


454 








f 

L 


o 








r5D*f# 1 




9 


V 


1 


282 








f 

L 


1 










YP nn4.9nn 

Ai _^UU*T^UU 


ft 


o 


Cm 


269 










T 
1 








rso*ff O 




t 

L 


M 

IVI 


o 

c. 


240 








a 
a 


K 
i\ 






8202414 


rs1040310 


XP_004200 


C 


D 
E 


3 


184 


8202536 


rs6475 


XP_004200 


g 
t 


1 


2 


173 








a 


N 






8202853 


rs6474 


XP_004200 


9 


R 


2 


103 








a 


K 






8200835 


rs6473 


XP_042400 


g 


S 


2 


225 








a 


N 






8200956 


rs6445 


XP_042400 


c 


P 


1 


185 








t 


S 






8201852 


rs6471 


XP_042400 


g 


V 


1 


13 








t 


L 







Table 26 



Gene (Synonyms): CYP27A1, "cytochrome P450, subfamily XXVIIA (steroid 27-hydroxylase, cerebrotendinous xanthomatosis), polypeptide 1" (CTX, CP27, CYP27)
Locus: 2q33-qter
Accession #s: S62709 (5' region); NT_005289 (working draft chromo2); NM_000784 (m); NP_000776 (p)
mRNA / Protein: 2059 bp / 531 aa
Structural Information: Cali et al. (1991), OMIM 213700. One mutation, called by them CTX1, was at codon 446 near the heme ligand, cys444. The second, called CTX2, was at codon 362 in the adrenodoxin binding region.

Allelic Variants from Scientific Literature:

DNA Variant                       Protein Variant  Phenotype                       References
G>A                               Arg372Gln        Cerebrotendinous xanthomatosis  OMIM 213700
C>T                               Arg441Trp        Cerebrotendinous xanthomatosis  OMIM 213700
G-to-A                            Arg441Gln        Cerebrotendinous xanthomatosis  OMIM 213700
(CGPy) to cysteine codons (TGPy)  Arg362Cys        Cerebrotendinous xanthomatosis  OMIM 213700
(CGPy) to cysteine codons (TGPy)  Arg446Cys        Cerebrotendinous xanthomatosis  OMIM 213700



Table 27 



Gene (Synonyms): CYP51, "cytochrome P450, 51 (lanosterol 14-alpha-demethylase)" (LDM, CP51, CYPL1, P450L1, P450-14DM)
Locus: 7q21.2-q21.3
Accession #s: AH006655; NT_029333 (g); NM_000786 (m); NP_000777 (p)
mRNA / Protein: 3381 bp

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_029333   2609660   rs2229188     XP_004663  t       V        2         19
                                               c       A
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Table 28 



Gene (Synonyms): EPHX1, "epoxide hydrolase 1, microsomal (xenobiotic)" (MEH, EPHX)
Locus: 1q42.1
Accession #s: AF253417, L29766, L25880; NT_004525 (g); NM_000120 (m); NP_000111 (p)
mRNA / Protein: 1856 bp / 455 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype                                                         References
?            Tyr113His        Epoxide hydrolase polymorphism, susceptibility to aflatoxin B1?  OMIM 132810

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position                Accession  Allele  Residue  Position  Acid
NT_004525   1595032                 XP_001799  g       R        2         454
                                               a       Q
NT_004525   1595753                 XP_001799  t       H        3         387
                                               a       Q
NT_004525   1601658                 XP_001799  g       R        2         139
                                               a       H
NT_004525   1608433                 XP_001799  c       H        1         113
                                               t       Y
NT_004525   1611484                 XP_001799  c       R        1         49
                                               t       C
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Table 29 



Gene (Synonyms): EPHX2 (epoxide hydrolase 2, cytoplasmic)
Locus: 8p21-p12
Accession #s: X97024 (exon 1); X97038 (exon 17, 18 and 19); NT_007988 (working draft chromo 8); NM_001979 (m); L05779 (m); NP_001970 (p)
mRNA / Protein: 2100 bp / 554 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_007988   233832    rs751141      XP_005114  g       R        2         287
                                               a       Q



Table 30 



Gene (Synonyms): GUSB (glucuronidase, beta)
Locus: 7q21.11
Accession #s: M65002 (5' end); Pseudogene AL021368 (BAC 55C20 on chromo6); M15182 (m); NM_000181 (m); NP_000172 (p)
mRNA / Protein: 2191 bp / 651 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype               References
?            Trp446Ter        Mucopolysaccharidosis
?            Trp507Ter        Mucopolysaccharidosis
?            Tyr495Cys        Mucopolysaccharidosis
?            Pro148Ser        Mucopolysaccharidosis
C1831T       Arg611Trp        Mucopolysaccharidosis
C1061T       Arg354Val        Mucopolysaccharidosis
C672T        Arg216Trp        Mucopolysaccharidosis
C>T          Arg382Cys        Mucopolysaccharidosis
C>T          Ala619Val        Mucopolysaccharidosis



Table 31 



Gene (Synonyms): KCNH2, "potassium voltage-gated channel, subfamily H (eag-related), member 2" (HERG, LQT2)
Locus: 7q35-q36
Accession #s: NT_007704 (working draft chromo7); U04270 (m); NP_000229 (p)
mRNA / Protein: 4070 bp / 1159 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype         References
G1468A       Ala490Thr        Long QT syndrome  OMIM 152427
?            Gly572Arg        Long QT syndrome  OMIM 152427
?            Arg582Cys        Long QT syndrome  OMIM 152427
G1882A       Gly628Ser        Long QT syndrome  OMIM 152427
G2647A       Val822Met        Long QT syndrome  OMIM 152427
T1961G       Ile593Arg        Long QT syndrome  OMIM 152427
A1408G       Asn470Asp        Long QT syndrome  OMIM 152427
C1682T       Ala561Val        Long QT syndrome  OMIM 152427

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_007704   8393      rs731506      XP_004743  t       V        2         41
                                               g       G
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Table 32 



Gene (Synonyms): LTA4H, "leukotriene A4 hydrolase"
Locus: 12q22
Accession #s: U27293 (exon 19 and complete cds); NM_000895 (m); J03459 (m); NP_000886 (p)
mRNA / Protein: 2060 bp / 611 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_009685   277202    rs1803916     XP_012237  c       T        2         600
                                               g       S



Table 33 



Gene (Synonyms): PTGIS, "prostaglandin I2 (prostacyclin) synthase" (CYP8, PGIS, PTGI, CYP8A1)
Locus: 20q13.11-q13.13
Accession #s: D83393 (exon 1); NT_011362 (working draft chromo20); NP_000952 (m); NP_000952 (p)
mRNA / Protein: 5605 bp / 500 aa

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_011362   13177081  rs5584        XP_030507  c       P        1         500
                                               t       S
NT_011362   13193363  rs5626        XP_030507  c       R        1         236
                                               t       C
NT_011362   13213471  rs5624        XP_030507  t       F        1         171
                                               c       L
NT_011362   13213521  rs5623        XP_030507  a       E        2         154
                                               c       A
NT_011362   13217020  rs5622        XP_030507  t       S        3         118
                                               a       R
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Table 34 



Gene (Synonyms): TPMT, "thiopurine S-methyltransferase"
Locus: 6p22.3
Accession #s: U81562; NT_007180 (g); NM_000367 (m); S62904 (m); NP_000358 (p)
mRNA / Protein: 2742 bp / 245 aa

Allelic Variants from Scientific Literature:

DNA Variant  Protein Variant  Phenotype                     References
A719G        Tyr240Cys        6-mercaptopurine sensitivity  OMIM 187680
G644A        Arg215His        6-mercaptopurine sensitivity  OMIM 187680
G460A        Ala154Thr        6-mercaptopurine sensitivity  OMIM 187680
G238C        Ala80Pro         6-mercaptopurine sensitivity  OMIM 187680

Allelic Variants from SNP Database:

Contig      Contig    dbSNP rs#     Protein    dbSNP   Protein  Codon     Amino
Accession   Position  (Cluster ID)  Accession  Allele  Residue  Position  Acid
NT_007180   151037    rs1800462     XP_012752  g       A        1         80
                                               c       P
NT_007180   164074    rs1142345     XP_012752  a       Y        2         240
                                               g       C



Table references

Cascorbi et al., Clin. Pharmacol. Ther. 69:169-174 (2001)
Choi et al., Cell 53:519-529 (1988)
Hoffmeyer et al., Proc. Natl. Acad. Sci. USA 97:3473-3478 (2000)
Ito et al., Pharmacogenetics 11:175-184 (2001)
Mickley et al., Blood 91:1749-1756 (1998)
Safa et al., Proc. Natl. Acad. Sci. USA 87:7225-7229 (1990)
Tanabe et al., J. Pharmacol. Exp. Ther. 297:1137-1143 (2001)
Toh et al., Amer. J. Hum. Genet. 64:739-746 (1999)
Wada et al., Hum. Mol. Genet. 7:203-207 (1998)
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Table 35 

BCR-ABL Kinase Domain Mutations Affecting Response to Imatinib



Mtifotinn 

iwiuiauon 


Phase of diseass* 


Proposed mechanism of 
resistance 


lVtt44V 


Ur 


impairs confoimational diange 
(Ploop) 




MdU 


impairs conformatlbnal change 
(Ploop) 




IvIdO 


impairs conformational change 
(Ploop) 


TZOOr/n 




Impairs confomiational change 
(Ploop) 




Mdu, LdC, CP, P-MdC 


impairs conformational change 
(P loop) 


lO 101 


hilDP 1 DP PD □ HiDP 

ivlDtr, LbO, Or, r-MBC 


directly affects Imatinib binding 


F317L 


mc. CP 


directly affects imatinib binding 


i\4351T 


MBC. LBC. CP 


Impairs conformational change 
(adjacent to activation loop) 


E355G 


MBC 


impairs confomiational change 
(adjacent to activation loop) 


F359V 


MBC, CP 


directly affects Imatinib binding 


V379I 


CP-NCR 


impairs conformational change 
(activation loop) 


L387M 


CP 


impairs conformational change 
(acth/ation loop) 


H396R 


MBC, CP 


impairs conformational change 
(acth/atlon loop) 



* MBC: myeloid blast crisis
LBC: lymphoid blast crisis

CP: chronic phase 

CP-NCR: chronic phase with hematologic response in the absence of 
cytogenetic response (cytogenetic nonresponder) 



25 



Table 36 
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Beta tubulin (Isoform M40) Mutations Affecting Response to Paclitaxel 



DNA Variant  Protein Variant  Phenotype             Reference
T810G        Phe270Val        paclitaxel resistance  Giannakakou et al., J. Biol. Chem. 272:17118-25 (1997)
G1092A       Ala364Thr        paclitaxel resistance  Giannakakou et al., J. Biol. Chem. 272:17118-25 (1997)



The collections of the present invention require that the genotypically distinct coisogenic cells be in sufficient spatial proximity to one another to be readily and contemporaneously subject to a common experimental protocol, yet remain separately assayable.

Separate assayability can easily be effected by maintaining each of the genotypically distinct coisogenic cells of the collection in fluid noncommunication with the others of the cells of the collection. Spatial proximity can be effected by disposing the cells within wells or other types of fluidly noncommunicating locations that are within or upon a common structure.

For example, each genotypically distinct cell (typically, cell line) can be disposed in a well (or wells) of a microtiter plate distinct from the well (or wells) in which other genotypically distinct cells are placed. Microtiter plates having 24, 96, 384, 864, 1536, 3456, 6144, and 9600 wells are now readily available commercially, and variants abound. For example, U.S. Patent No. 6,171,780 B1 describes low fluorescence multiwell platforms for cellular screening assays. U.S. Patent No. 6,103,479 describes methods and apparatus for non-uniform micro-patterned arrays of cells. Chiu et al., Proc. Natl. Acad. Sci. USA 97(6):2408-13 (2000) describe the patterned deposition of cells onto surfaces by using three-dimensional microfluidic systems. A wide variety of "chip-based" microfluidic devices for arraying cells have also now been described. See, e.g., U.S. Patent No. 6,086,740 ("Multiplexed microfluidic devices and systems").
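
The well-per-genotype arrangement described above can be sketched in code. The following is only an illustration under stated assumptions, not part of the disclosed methods: the function name `plate_layout`, the cell line labels, and the row-major fill order are all hypothetical, and only plate formats with at most 26 rows (such as 96-well, 8 x 12, or 384-well, 16 x 24) are handled, because single-letter row labels are assumed.

```python
from string import ascii_uppercase


def plate_layout(cell_lines, rows=8, cols=12, replicates=1):
    """Assign each genotypically distinct cell line to its own well(s).

    Wells are filled in row-major order (A1, A2, ... A12, B1, ...), and no
    well is shared, so each line stays separately assayable.
    """
    if rows > len(ascii_uppercase):
        raise ValueError("single-letter row labels support at most 26 rows")
    wells = [f"{ascii_uppercase[r]}{c + 1}" for r in range(rows) for c in range(cols)]
    if len(cell_lines) * replicates > len(wells):
        raise ValueError("not enough wells for the requested layout")
    next_well = iter(wells)
    return {line: [next(next_well) for _ in range(replicates)] for line in cell_lines}


# Hypothetical collection: one parental line plus four engineered variants,
# each plated in duplicate on a 96-well plate.
layout = plate_layout(["parent", "v1", "v2", "v3", "v4"], replicates=2)
for line, assigned in layout.items():
    print(line, assigned)
```

Keeping the mapping explicit makes it straightforward to relate each assay readout back to the genotype in the corresponding well.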

Alternatively, the genotypically distinct cells of the collection can be maintained in fluid noncommunication by disposing each genotypically distinct cell (typically, as a genotypically distinct cell line) in a separate, structurally discrete, fluidly noncommunicating container, such as a vial, ampule, or tube; spatial proximity can in such cases be effected by packaging the separate containers together. In such cases, the cell collections of the present invention take the form of a kit, and it is therefore another aspect of the present invention to provide kits comprising the coisogenic cell collections of the present invention.

The kits comprise at least five genotypically distinct cells, the cells contained within separate, structurally discrete, fluidly noncommunicating containers; the at least five structurally discrete containers are packaged together. As described above, each of the at least 5 genotypically distinct cells is coisogenic with respect to the others of the at least 5 genotypically distinct cells at a target locus common thereamong.

Since the cell collections of the present invention can include a great many more than five genotypically distinct cells, the kits of the present invention can usefully and additionally include computer-readable media having at least one dataset that defines the genotype of the cells of the collection at least at the target locus; the dataset can usefully include links to extrinsic databases, such as the Online Mendelian Inheritance in Man (OMIM) (http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=OMIM), the Human Gene Mutation Database (HGMD) (http://archive.uwcm.ac.uk/uwcm/mg/hgmd0.html), or more general databases, such as GenBank, or the UCSC human genome project working draft (http://genome.ucsc.edu/).
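
One way such a dataset might be organized is sketched below. This is a hypothetical layout, not a format defined by the invention: the record fields, the `cell_id` labels, and the choice of JSON are assumptions made for illustration. The example variants and the OMIM entry number are those listed for TPMT in Table 34, and the OMIM address is the query URL given above.

```python
import json
from dataclasses import dataclass, asdict

# Query URL as given in the text; per-entry links are not constructed here.
OMIM_URL = "http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=OMIM"


@dataclass
class CellRecord:
    cell_id: str          # hypothetical label for a vial or well
    target_locus: str     # locus at which the collection is coisogenic
    dna_variant: str      # empty string for the unaltered parental cell
    protein_variant: str  # empty string for the unaltered parental cell
    omim_id: str          # OMIM entry number for the locus


def build_dataset(records):
    """Serialize the genotype of every cell at the shared target locus."""
    loci = {r.target_locus for r in records}
    if len(loci) != 1:
        raise ValueError("all cells in a collection must share one target locus")
    return json.dumps(
        {
            "target_locus": loci.pop(),
            "omim_url": OMIM_URL,
            "cells": [asdict(r) for r in records],
        },
        indent=2,
    )


cells = [
    CellRecord("cell-1", "TPMT", "", "", "187680"),
    CellRecord("cell-2", "TPMT", "A719G", "Tyr240Cys", "187680"),
    CellRecord("cell-3", "TPMT", "G644A", "Arg215His", "187680"),
    CellRecord("cell-4", "TPMT", "G460A", "Ala154Thr", "187680"),
    CellRecord("cell-5", "TPMT", "G238C", "Ala80Pro", "187680"),
]
print(build_dataset(cells))
```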

Fluid noncommunication is not required where the genotypically distinct cells can be distinguished even in admixture. In such case, the cells can be contained in a common container, such as a tube, ampule, well, or dish; the required spatial proximity is of course thus necessarily maintained.

For example, if the assay measures cell proliferation under a chosen condition, such as exposure to a chemotherapeutic agent, e.g., paclitaxel or a derivative thereof, and the cells are individually bar coded, the cells can be commonly cultured in the presence of the drug agent, and the degree of individual proliferation assessed by stoichiometric amplification and quantification of their respective bar codes. See, e.g., U.S. Patent No. 6,046,002, incorporated herein by reference in its entirety.
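
The arithmetic behind such a pooled, bar-coded readout can be illustrated as follows. This is a simplified sketch rather than the method of the cited patent: it assumes that each bar code count is proportional to the number of cells carrying that bar code, and the function name, labels, and counts are invented for the example.

```python
def relative_proliferation(before, after, reference):
    """Compare each bar-coded line's change in pool share to a reference line.

    `before` and `after` map bar code labels to counts measured before and
    after culture in the presence of the drug. A value above 1.0 means the
    line expanded more than the reference line did under the same condition.
    """
    if set(before) != set(after):
        raise ValueError("bar code sets differ between the two time points")
    if any(count <= 0 for count in before.values()):
        raise ValueError("every bar code needs a positive starting count")
    total_before = sum(before.values())
    total_after = sum(after.values())
    if total_after <= 0:
        raise ValueError("no bar code counts after culture")
    # Change in each line's fraction of the pool.
    change = {
        code: (after[code] / total_after) / (before[code] / total_before)
        for code in before
    }
    if change[reference] == 0:
        raise ValueError("reference line was not detected after culture")
    return {code: value / change[reference] for code, value in change.items()}


# Invented counts for a two-line pool cultured with drug.
before = {"parent": 100, "Phe270Val": 100}
after = {"parent": 50, "Phe270Val": 150}
print(relative_proliferation(before, after, reference="parent"))
```

Normalizing to a reference line in the same container cancels out differences in total recovery between samples, which is the practical benefit of assaying the cells in admixture.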

Additionally, the coisogenic cell collections of the present invention need not be in a form that can immediately be assayed. Rather, the collections can be provided in any physical form that will, at some point, permit the genotypically distinct cultured cells separately to be assayed. In one embodiment, for example, the cells can be provided frozen, either in individual tubes or ampules or collectively in the wells of a microtiter dish, thereafter to be thawed, propagated, and assayed. Where the cells are yeast cells, the cells can conveniently be provided frozen or lyophilized.

The invention further provides, in another aspect, methods of making the coisogenic cell collections of the present invention.

In a basic embodiment, the method comprises collecting at least 5 genotypically distinct cells, each of the cells being coisogenic with respect to the others of the at least 5 genotypically distinct cells at a target locus common thereamong, into a collection in which each of the genotypically distinct cells can be separately assayed.

Typically, but not invariably, the method further comprises the earlier step of making cells that are coisogenic at a common target locus. The coisogenic cells are made by engineering, into at least four of at least five cultured cells, the cells derived from a common eukaryotic ancestor cell, a genomic sequence alteration at a target locus common thereamong; the sequence alterations must be sufficient to cause at least five distinct protein sequences collectively to be encoded by the cells at the common target locus.
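
The requirement that the cells collectively encode at least five distinct protein sequences can be checked mechanically once the intended changes are known. The sketch below is illustrative only: the helper names, the one-letter `A2T` style of notation, and the six-residue toy sequence are assumptions, and a real check would use the full protein sequence encoded at the target locus.

```python
import re


def apply_substitution(protein, change):
    """Apply a single amino acid substitution written like 'A2T' (1-based)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognized substitution: {change!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if pos < 1 or pos > len(protein) or protein[pos - 1] != ref:
        raise ValueError(f"{change!r} does not match the parental sequence")
    return protein[: pos - 1] + alt + protein[pos:]


def encodes_enough_variants(proteins, minimum=5):
    """True if the cells collectively encode at least `minimum` distinct proteins."""
    return len(set(proteins.values())) >= minimum


# Toy parental sequence (hypothetical); four of the five cells are altered.
parent = "MARTYC"
collection = {"parent": parent}
for change in ["A2T", "R3H", "T4A", "Y5C"]:
    collection[change] = apply_substitution(parent, change)

print(collection)
print(encodes_enough_variants(collection))
```

Because the unaltered parental cell supplies one of the five sequences, alterations in only four of the five cells are enough to satisfy the condition, as the text states.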

The genomic sequence alterations can be created by any means that permits mutations to be targeted to genomic sequence. In a presently preferred approach, mutations are targeted to a common target locus using modified single-stranded oligonucleotides ("targeting oligonucleotides").

We have recently described methods for targeting single nucleotide changes directly into long pieces of genomic DNA present within YACs, BACs, and even intact cellular chromosomes through use of sequence-altering oligonucleotides. See international patent publication nos. WO 01/73002, WO 01/92512, and WO 02/10364; and commonly owned and copending U.S. provisional patent application nos. 60/326,041, filed September 27, 2001; 60/337,129, filed December 4, 2001; 60/393,330, filed July 1, 2002; 60/363,341, filed March 7, 2002; 60/363,053, filed March 7, 2002; and 60/363,054, filed March 7, 2002, the disclosures of which are incorporated herein by reference in their entireties. These methods, described in further detail below, are presently preferred.

Other approaches for targeting sequence changes using sequence altering oligonucleotides have also been described. See, e.g., U.S. Patent Nos. 6,303,376; 5,776,744; 6,200,812; 6,074,853; 5,948,653; 6,136,601; 6,010,907; 5,888,983; 5,871,984; 5,760,012; 5,756,325; and 5,565,350, the disclosures of which are incorporated herein by reference in their entireties. These latter approaches typically have lower efficiency and are at present less preferred, although they may at times be used.

Changes can be targeted directly into cellular chromosomes within cultured eukaryotic cells. In other embodiments, changes can instead be targeted to recombinant constructs in vitro, with the modified target thereafter used to integrate the desired change into a cultured eukaryotic cell.

The first of these approaches is particularly preferred for creating coisogenic cell collections that are legacy-free, and/or exceptionally or perfectly coisogenic. The second approach is preferred, inter alia, in construction of coisogenic cell collections having identical targeted changes superimposed on different genetic backgrounds.

In the latter approach, the vector is usefully an artificial chromosome, such as a YAC (yeast artificial chromosome), BAC (bacterial artificial chromosome), PAC (P-1 derived artificial chromosome), HAC (human artificial chromosome), or PLAC (plant artificial chromosome).

Artificial chromosomes are reviewed in Larin et al., Trends Genet. 18(6):313-9 (2002); Choi et al., Methods Mol. Biol. 175:57-68 (2001); Brune et al., Trends Genet. 16(6):254-9 (2001); Ascenzioni et al., Cancer Lett. 118(2):135-42 (1997); Fabb et al., Mol. Cell Biol. Hum. Dis. Ser. 5:104-24 (1995); Huxley, Gene Ther. 1(1):7-12 (1994), the disclosures of which are incorporated herein by reference in their entireties. Other vectors that may be used include viral, typically eukaryotic viral, vectors, such as adenoviral, varicella, and herpesvirus vectors.

Yeast artificial chromosomes (YACs) are additionally described in Burke et al., Science 236:806; Peterson et al., Trends Genet. 13:61 (1997); Choi et al., Nature Genet. 4:117-223 (1993); Davies et al., Biotechnology 11:911-914 (1993); Matsuura et al., Hum. Mol. Genet. 5:451-459 (1996); Peterson et al., Proc. Natl. Acad. Sci. 93:6605-6609 (1996); and Schedl et al., Cell 86:71-82 (1996). Human artificial chromosomes (HACs) are additionally described in Kuroiwa et al., Nature Biotechnol. 18(10):1086-90 (2000); Henning et al., Proc. Natl. Acad. Sci. USA 96(2):592-7 (1999); and Harrington et al., Nature Genet. 15(4):345-55 (1997). Bacterial artificial chromosomes (BACs) and P-1 derived artificial chromosomes (PACs) are further described in Mejia et al., Genome Res. 7:179-186 (1997); Shizuya et al., Proc. Natl. Acad. Sci. 89:8794-8797 (1992); Ioannou et al., Nature Genet. 6:84-89 (1994); and Hosoda et al., Nucleic Acids Res. 18:3863 (1990). Other vectors useful in the present invention are further described in Sternberg et al., Proc. Natl. Acad. Sci. USA 87:103-107 (1990).

BACs have been developed for transformation of plants with
high-molecular-weight DNA using the T-DNA system (Hamilton, Gene 24:107-116 (1997); Frary
et al., Transgenic Res. 10:121-132 (2001)).

In certain useful embodiments, genomic targets are present within vectors
that permit integration of the target into a cellular chromosome. In particularly useful
embodiments, genomic targets are present within vectors that permit site-directed
integration of the target into a cellular chromosome. Usefully, the vector is an artificial
chromosome and site-specific integration may be performed by recombinase mediated
cassette exchange (RMCE).

In RMCE, a region of DNA (cassette) desired to be integrated into a
specific cellular chromosomal location is flanked in a recombinant vector by sites that are
recognized by a site-specific recombinase, such as loxP sites and derivatives thereof for
Cre recombinase and FRT sites and derivatives thereof for Flp recombinase. Other
site-specific recombinases having cognate recognition/recombination sites useful in such
methods are known (see, e.g., Blake et al., Mol. Microbiol. 23(2):387-98 (1997)).

The site in the cellular chromosome into which the cassette is to be
site-specifically integrated is analogously flanked by recognition sites for the same
recombinase.

To favor a double-reciprocal crossover exchange reaction between vector
and chromosome, two approaches are typical. In the first, the two sites (such as lox or
FRT) that flank the cassettes in both vector and cellular chromosome are heterospecific:
that is, they differ from one another and recombine with each other with far lower efficiency
than with sites identical to themselves. In the second, the lox or FRT sites are inverted.
See, e.g., Baer et al., Curr. Opin. Biotechnol. 12:473-480 (2001); Langer et al., Nucl. Acids
Res. 30:3067-3077 (2002); Feng et al., J. Mol. Biol. 292:779-785 (1999), the disclosures of
which are incorporated herein by reference in their entireties.

Recombinational exchange of the cassettes from vector to cellular
chromosome, with integration of the construct cassette site-specifically into the cellular
chromosome, is effected by introducing the recombinant construct into the cell and
expressing the site-specific recombinase appropriate to the recombination sites used. The
site-specific recombinase may be expressed transiently or continuously, either from an
episome or from a construct integrated into a cellular chromosome, using techniques well
known in the art.

Site-specific recombinational insertion provides a single-copy integrant of
defined and chosen sequence in a defined cellular genomic milieu. It is known that such
site-specific integration provides more consistent expression than does random integration.
Feng et al., J. Mol. Biol. 292:779-785 (1999).

Our presently preferred methods for targeting single nucleotide changes
directly into genomic DNA, whether targeted directly into a eukaryotic chromosome or first
targeted into a recombinant construct in vitro, are further described in international patent
publication nos. WO 01/73002, WO 01/92512, and WO 02/10364; and commonly owned
and copending U.S. provisional patent application nos. 60/326,041, filed September 27,
2001; 60/337,129, filed December 4, 2001; 60/393,330, filed July 1, 2002; 60/363,341, filed
March 7, 2002; 60/363,053, filed March 7, 2002; and 60/363,054, filed March 7, 2002; the
disclosures of which are incorporated herein by reference in their entireties.

Briefly, the method comprises combining the targeted nucleic acid, in the
presence of cellular repair proteins, with a single-stranded oligonucleotide 17 - 121
nucleotides in length, the oligonucleotide having an internally unduplexed domain of at least
8 contiguous deoxyribonucleotides. The oligonucleotide is fully complementary in sequence
to the sequence of a first strand of the nucleic acid target, but for one or more mismatches
as between the sequences of the internally unduplexed deoxyribonucleotide domain and its
complement on the target nucleic acid first strand. Each of the mismatches is positioned at
least 8 nucleotides from each of the oligonucleotide's 5' and 3' termini, and the
oligonucleotide has at least one terminal modification.
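
By way of illustration only, the length and mismatch-placement constraints just stated can be checked computationally. The following Python sketch is not part of the disclosed method; the function names are hypothetical, the oligonucleotide is compared against a perfectly complementary oligonucleotide of the same length, and "at least 8 nucleotides from each terminus" is assumed to mean that at least 8 nucleotides lie on each side of every mismatch. Terminal modifications are not modeled.

```python
from typing import List


def mismatch_positions(oligo: str, perfect_match: str) -> List[int]:
    """Return 0-based positions at which the oligo differs from a perfectly
    complementary oligo of the same length (both written 5'->3')."""
    if len(oligo) != len(perfect_match):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(oligo.upper(), perfect_match.upper()))
            if a != b]


def meets_design_rules(oligo: str, perfect_match: str,
                       min_len: int = 17, max_len: int = 121,
                       min_flank: int = 8) -> bool:
    """Check the stated constraints: 17-121 nt overall, at least one mismatch,
    and (by assumption) at least `min_flank` nucleotides on each side of
    every mismatch."""
    n = len(oligo)
    if not (min_len <= n <= max_len):
        return False
    positions = mismatch_positions(oligo, perfect_match)
    if not positions:
        return False  # no mismatch means no sequence change is encoded
    return all(p >= min_flank and (n - 1 - p) >= min_flank for p in positions)


# Placeholder 25-nt sequence (not a real target) with one central mismatch.
perfect = "ACGT" * 6 + "A"
edited = perfect[:12] + "G" + perfect[13:]
design_ok = meets_design_rules(edited, perfect)
```

A check of this kind would be only a first filter; whether a given oligonucleotide performs well remains an empirical question.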

The oligonucleotide terminal modification is typically selected from the
group consisting of at least one terminal locked nucleic acid (LNA), at least one terminal
2'-O-Me base analog, and at least three terminal phosphorothioate linkages.

LNAs are bicyclic and tricyclic nucleoside and nucleotide analogs and the
oligonucleotides that contain such analogs. The basic structural and functional
characteristics of LNAs and related analogues that usefully may be incorporated into the
second ("annealing") oligonucleotide in the methods of the present invention are disclosed
in various publications and patents, including WO 99/14226, WO 00/56748, WO 00/66604,
WO 98/39352, U.S. Patent No. 6,043,060, and U.S. Patent No. 6,268,490, the disclosures
of which are incorporated herein by reference in their entireties. See also Singh et al.,
Chem. Commun. 1998:455; Koshkin et al., Tetrahedron 54:3607 (1998); Koshkin et al.,
Tetrahedron Lett. 39:4381 (1998); Singh et al., Chem. Commun. 1998:1247. LNAs are
reviewed in Ørum et al., "Locked nucleic acids: a promising molecular family for
gene-function analysis and antisense drug development," Curr. Opin. Mol. Ther.
3(3):239-43 (2001), the disclosures of which are incorporated herein by reference in their
entireties.

Synthesis of LNA nucleosides and nucleoside analogs and oligonucleotides
that contain them may be performed as disclosed in WO 99/14226, WO 00/56748,
WO 00/66604, WO 98/39352, U.S. Patent No. 6,043,060, and U.S. Patent No. 6,268,490.
Many may now be ordered commercially (Exiqon, Inc., Vedbaek, Denmark; Proligo LLC,
Boulder, CO, USA).

The oligonucleotides are typically at least 17 nucleotides in length, and can
usefully be up to about 121 nucleotides in length, and even longer, although targeting
oligonucleotides of about 17 to about 74 nucleotides in length are at present preferred. The
oligonucleotides used to create the coisogenic cell collections may thus have lengths of 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65,
66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89,
90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109,
110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, or 121 nt.

At present most preferred are targeting oligonucleotides at least about 25
bases in length, unless there are self-dimerization structures within the oligonucleotide; if
the oligonucleotide has such an unfavorable structure, lengths longer than 35 bases are
preferred.

The internally unduplexed alteration domain of the targeting oligonucleotide
is preferably fully complementary to one strand of the target locus, except for the
mismatched base (or up to about 3 mismatched bases) introduced to effect the gene
alteration or conversion events. The central alteration domain is generally at least
8 nucleotides in length. Although it is presently preferred to locate the
alteration domain approximately in the middle of the targeting oligonucleotide, there is no
strict requirement for symmetrical extension adjacent to the alteration DNA domain.
However, the base(s) targeted for alteration in the most preferred embodiments are at least
about 8, 9 or 10 bases from each of the ends of the targeting oligonucleotide.

The targeting oligonucleotide preferably binds to the non-transcribed strand
of a genomic DNA duplex.

The oligonucleotides used to make the coisogenic cell collections of the
present invention preferably contain more than one of the aforementioned modifications
("backbone modifications"), preferably (but not obligately) at both ends of the
oligonucleotide. In some embodiments, the backbone modifications are adjacent to one
another. For oligonucleotides of the invention that are longer than about 17 to about 25
bases in length, internal as well as terminal region segments of the backbone can be
altered.

The optimal number and placement of backbone modifications for any
individual oligonucleotide will vary with the length of the oligonucleotide and the particular
type of backbone modification(s) that are used, and may be determined by routine
comparative studies, as further described in WO 01/73002 and commonly owned and
copending U.S. patent application serial no. 09/818,875, filed March 27, 2001, the
disclosures of which are incorporated herein by reference in their entireties.

The sequence-altering oligonucleotide can be contacted to its genomic
target within intact cells, within cell-free protein extracts having cellular repair proteins, or
within purified protein fractions having cellular repair proteins.

Efficiency of conversion is defined herein as the percentage of recovered
substrate molecules that have undergone a conversion event. Depending on the nature of
the target genetic material, e.g. the genome of a cell or a genomic construct in a replicable
vector, efficiency can be represented as the proportion of cells or clones containing an
extrachromosomal element that exhibit a particular phenotype. Alternatively, representative
samples of the target genetic material can be sequenced to determine the percentage that
have acquired the desired change.
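
The definition of conversion efficiency given above reduces to a simple percentage. The following Python sketch is offered only as an illustration; the function name and the example counts are hypothetical and do not report experimental results.

```python
def conversion_efficiency(converted: int, recovered: int) -> float:
    """Percentage of recovered substrate molecules (or clones) that have
    undergone the conversion event."""
    if recovered <= 0:
        raise ValueError("at least one substrate molecule must be recovered")
    if not 0 <= converted <= recovered:
        raise ValueError("converted count must lie between 0 and recovered")
    return 100.0 * converted / recovered


# Hypothetical counts: 3 converted clones among 200 sequenced clones.
example_efficiency = conversion_efficiency(3, 200)
```

The same calculation applies whether conversion is scored by phenotype or by sequencing a representative sample, provided the denominator is the number of substrate molecules actually recovered and scored.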

Efficiency can be increased using the methods set forth in commonly
owned and copending U.S. provisional application serial nos. 60/363,341, filed March 7,
2002; 60/363,053, filed March 7, 2002; and 60/363,054, filed March 7, 2002, the disclosures
of which are incorporated herein by reference in their entireties.

In the first of these methods, the eukaryotic cell to be targeted, or that
provides the protein extract having cellular repair enzymes within which a recombinant
construct is targeted, is first contacted with an inhibitor of histone deacetylase (HDAC), such
as Trichostatin A. In the second of these methods, the sequence-altering oligonucleotide is
contacted with the genomic target (either within a cell or within a cell extract) in the
presence of lambda beta protein. In the third of these methods, the eukaryotic cell to be
targeted, or that provides the protein extract within which a recombinant construct is
targeted, is first contacted with hydroxyurea.

Targeting efficiency may also be increased using the methods set forth in
U.S. provisional patent application serial nos. 60/325,992, filed September 27, 2001;
60/337,129, filed December 4, 2001; and 60/393,330, filed July 1, 2002, the disclosures of
which are incorporated herein by reference in their entireties, and in U.S. provisional
application serial nos. 60/220,999, filed July 27, 2000; and 60/244,989, filed October 30,
2000, the disclosures of which are incorporated herein by reference in their entireties.

In various of these methods, the cell or cell-free extract within which
targeting is performed has altered levels or activity of at least one protein from the RAD52
epistasis group, the mismatch repair group or the nucleotide excision repair group, such as
reduced levels or activity of at least one protein selected from the group consisting of a
homolog, ortholog or paralog of RAD1, RAD51, RAD52, RAD57 and PMS1.

In others of these methods, the cell or cell-free extract within which
targeting is performed has increased levels or activity of at least one of RAD10, RAD51,
RAD52, RAD54, RAD55, MRE11, PMS1 or XRS2 proteins and decreased levels or activity
of at least one other protein selected from the group consisting of RAD1, RAD51, RAD52,
RAD57 or PMS1.




The targeting oligonucleotides can introduce more than a single base
change in a single step. For example, in an oligonucleotide that is about a 70-mer, with at
least one modified residue incorporated on each of the two ends, multiple bases up to 27
nucleotides apart can be targeted. However, when the targeting oligonucleotide includes
multiple sequence changes, not all transformants will include all genetic changes: there is a
frequency distribution such that the closer the target bases are to each other in the
alteration domain, the higher the frequency of change in a given cell. Target bases only two
nucleotides apart are changed together in every case that has been analyzed. The farther
apart the two target bases are, the less frequent the simultaneous change.
Thus, in creating the coisogenic cell collections of the present invention,
targeting oligonucleotides can be used to alter multiple bases at the target locus, rather than
just a single base. Furthermore, iterative rounds of targeting can be performed to introduce
multiple changes.

In embodiments in which the genome is targeted directly in the cell, the
targeting oligonucleotides can be introduced into the cell by any means known in the art,
such as through use of polycations, cationic lipids, liposomes, polyethylenimine (PEI),
electroporation, biolistics, microinjection and other methods known in the art to facilitate
cellular uptake; indeed, at times the targeting oligonucleotides can be introduced by simple
incubation without any adjunctive means.
In alternative embodiments, the targeting oligonucleotide can be used to
introduce the alteration into a genomic DNA construct, with the altered construct thereafter
introduced into the cells by known transfection techniques. Typically, the altered construct
is far larger than the targeting oligonucleotide, and is sufficient in length to act as a
substrate for subsequent homologous recombination with the cellular chromosome.

The coisogenic cell collections of the present invention are useful for
screening for the phenotypic effects of changes in the protein sequence encoded at a target
locus. Because the cells of the collection are coisogenic, phenotypic differences detected
among the cells of the collection can more reliably be ascribed to the differences in
sequence at the target locus than in assays using genetically more heterogeneous cells in
which additional changes at the target locus, or further changes at loci other than the target
locus, can confound the analysis. Furthermore, given the ability readily to include within the
collection of the present invention coisogenic cells that collectively have changes at many
(including all) of the amino acids encoded at the target locus, the coisogenic cell collections
of the present invention are extremely useful for dissecting structure-activity relationships
within proteins.

Thus, in another aspect, the invention provides a method of identifying
genotypes of a target locus that alter a cellular phenotype.

The method comprises assaying each genotypically distinct cell of a
coisogenic cell collection of the present invention for a common phenotypic characteristic;
the genotypically distinct cells are coisogenic at a desired target locus. From the assay
results, at least one genotypically distinct cell is identified within the collection that has an
alteration in the assayed phenotypic characteristic (i.e., that exhibits an altered phenotype).
Assay results are correlated with the target locus genotype, the correlation identifying
genotypes of the target locus that cause an alteration of the cellular phenotype.

The phenotypic characteristic can be any cellular characteristic relevant to
the target locus that can be assayed in vitro. A wide variety of such in vitro assays exist,
and the principles for design of such assays are by now well known; accordingly, details will
not here be presented.

Briefly, however, and solely by way of example, where the target locus
encodes a steroid receptor, the phenotypic characteristic can be the detectable
translocation of the receptor from cytoplasm to nucleus upon contact of the cells to the
receptor's cognate ligand, as is described, inter alia, in U.S. Patent No. 5,989,835. The
phenotypic characteristic where the target locus encodes a steroid hormone receptor can
alternatively (or additionally) be the expression of a detectable reporter, such as a
fluorescent protein (e.g., GFP), driven from a hormone-responsive promoter. In this latter
case, the assay depends upon the presence commonly within the cells of the coisogenic
collection of a recombinant reporter construct. The recombinant construct can be present
within the cells either on an episome or, usefully, integrated into the cellular genome at a
locus elsewhere than at the target locus.

Where the target locus encodes a protein known to affect drug
responsiveness, such as those described in detail above, the cellular characteristic to be
assayed can be as simple and fundamental as degree of cell death, or can alternatively (or
additionally) be, for example, the degree of cellular proliferation, degree of metabolic
activity, and/or the degree of apoptosis. Appropriate assays are described in several
compendia, such as Apoptosis and Cell Proliferation, 2nd ed., Boehringer Mannheim, 1998
(available on-line at
http://biochem.boehringer-mannheim.com/prod_inf/manuals/cell_man/acp.pdf), and Poirier
(ed.), Apoptosis Techniques and Protocols, Humana Press, 1997 (ISBN: 0896034518), the
disclosures of which are incorporated herein by reference. In addition, a wide variety of
assay kits are available commercially (e.g., the CellTiter 96® AQueous Non-Radioactive
Cell Proliferation Assay, catalogue no. G5421, Promega, Madison, WI, which is a
colorimetric method for determining the number of viable cells in proliferation, cytotoxicity or
chemosensitivity assays; the Apoptosis Detection System, Fluorescein, catalogue no.
G3250, and the DeadEnd™ Colorimetric Apoptosis Detection System, catalogue no.
G7360, both from Promega, Madison, WI; ApoAlert™ Apoptosis Detection Kits, Clontech
Labs, Palo Alto, CA, USA).

Where the target locus encodes a protein known to affect drug
responsiveness by transport of the drug from the cell interior to the medium, the
characteristic to be assayed can alternatively, or additionally, be accumulation or efflux of
the drug of interest or proxy therefor. Assays are now well known that permit such
accumulation and/or efflux to be measured.

For example, U.S. Patent Nos. 6,277,655 and 5,872,014, incorporated
herein by reference in their entireties, describe assays for activity of ABCB1 (MDR1) based
upon fluorescent detection of the degree of cellular accumulation of free calcein after
exposure to an acetoxymethyl ester or acetate ester of calcein. Ludescher et al., Br. J.
Haematol. 82(1):161-8 (1992) describe a flow cytometric assay for ABCB1 activity based
upon degree of intracellular accumulation of rhodamine 123. Gheuens et al., Cytometry
12(7):636-44 (1991), describe flow cytometric double labeling techniques for assay of
multidrug resistance. Cano-Gauci et al., Biochem. Biophys. Res. Commun. 167(1):48-53
(1990) describe a fast kinetic analysis assay for drug transport in multidrug resistant cells
using a pulsed quench-flow apparatus. Van Acker et al., Leukemia 9:1398-406 (1995)
describe a rapid flow cytometric functional assay for P-glycoprotein (encoded by ABCB1)
using fluo-3. Other assays are reviewed in Hoffman, "In vitro assays for chemotherapy
sensitivity," Crit. Rev. Oncol. Hematol. 15(2):99-111 (1993); Cree et al., "Tumor
chemosensitivity and chemoresistance assays," Cancer 78(9):2031-2 (1996).

The assay can detect a phenotypic characteristic under static
environmental conditions, or can instead detect a phenotypic characteristic during or
after an alteration in the cellular environment. In a useful embodiment of this latter
approach, the coisogenic collection of cells is first exposed to a xenobiotic, usefully a known
or potential therapeutic agent, and a characteristic of the cells measured thereafter.

Analogously, the assay can detect an equilibrium or otherwise static aspect
of the phenotypic characteristic, or can detect kinetic changes in the phenotypic
characteristic. For example, in an assay for cytoplasm to nuclear translocation of a steroid
receptor, the assay can measure the static nuclear:cytoplasmic ratio of the receptor or can,
in the alternative or in addition, measure the rate of translocation from cytoplasm to nucleus.

The assay can be quantitative or qualitative, manual or automated. 

From the assay results, at least one cell is identified that has an altered
cellular phenotype.

As would be well understood, not all genotypic changes at the target locus
will affect the measured phenotypic characteristic. In order, however, to identify residues of
the target protein whose change (by way of substitution, deletion, elimination by truncation,
etc.) affects a phenotypic characteristic, at least one cell must be identified that has an
alteration in the assayed phenotypic characteristic.

That said, data on residues of the protein encoded at the target locus that
are tolerant of substitution are also tremendously useful, and in another aspect, therefore,
the invention provides the converse method, in which residues tolerant of alteration are
identified; in this latter method, correlation of the target locus genotype of cells that do not
exhibit change in the assayed phenotypic characteristic identifies residues tolerant of
substitution.

As would be readily understood, the "altered phenotype" is altered relative
to a chosen control. The control is typically a coisogenic cell, typically in the same
collection, that has a desired reference target locus sequence. The desired reference target
locus sequence can, for example, be that of the parent cell (typically, cell line) from which
the coisogenic cells of the collection have been engineered; that which is most commonly
observed in a given population (e.g., the predominant allelic variant of the target locus in a
chosen human population); or one chosen based upon prior-determined results of a
phenotypic assay.

Following the assay, the results of the phenotypic assay are correlated with
the cells' respective target locus genotypes.

The correlation can be performed either before or after identifying, from the
assay results, at least one cell with altered cellular phenotype. If performed after the subset
with altered phenotypic characteristic is identified, the correlation of phenotype with target
locus genotype can be limited to that subset; if performed before the subset with altered
phenotype is identified, as would typically be the case in high throughput applications of the
methods of the present invention, the correlation of phenotype with target locus genotype
would typically be made for all cells of the coisogenic cell collection.

In either case, the correlation of the subset's phenotypic assay results with
their respective target locus genotypes identifies those genotypes of the target locus that
cause an alteration of the cellular phenotype.

Correlation can be as simple as noting a change in phenotype for a given
genotype, such as an increase in cytotoxicity occasioned by contact with a
chemotherapeutic agent in a cell having a change in a specific ABCB1 amino acid.
Alternatively, or in addition, correlation can be performed using statistical algorithms known
in the art.
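
Solely as an illustration of the simplest form of such correlation, the following Python sketch sorts the genotypes of a collection into those whose assay value differs from that of the reference (control) cell and those that are tolerant of the change. The function name, the genotype labels, the measurement values, and the 20% relative-difference threshold are all assumptions introduced for this example; a real analysis would typically rely on replicate measurements and a suitable statistical test.

```python
from typing import Dict, List, Tuple


def classify_genotypes(measurements: Dict[str, float], reference: str,
                       tolerance: float = 0.2) -> Tuple[List[str], List[str]]:
    """Split genotypes into (altered, tolerant) by relative difference of the
    assay value from that of the reference genotype."""
    ref_value = measurements[reference]
    if ref_value == 0:
        raise ValueError("reference value must be nonzero for a relative comparison")
    altered: List[str] = []
    tolerant: List[str] = []
    for genotype, value in measurements.items():
        if genotype == reference:
            continue
        relative_change = abs(value - ref_value) / abs(ref_value)
        (altered if relative_change > tolerance else tolerant).append(genotype)
    return altered, tolerant


# Hypothetical viability readings after drug exposure (arbitrary units).
readings = {"ref": 1.0, "var1": 0.5, "var2": 1.05}
altered, tolerant = classify_genotypes(readings, reference="ref")
```

The first list corresponds to genotypes that alter the cellular phenotype; the second corresponds to the converse method, in which changes tolerated at the target locus are identified.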

Where the coisogenic cell collection includes cells that collectively include
changes at each amino acid of the protein encoded at the target locus (typically excluding
changes of the initiator methionine), correlation of phenotype with genotype can identify all
residues of the protein that are critical to its function. Where the coisogenic cell collection
includes cells that collectively include each of the 20 natural amino acids at a single residue
location, typically a residue previously shown or suspected to contribute to protein function,
correlation of phenotype with genotype can identify with precision the structural
requirements for function at that residue. Where the coisogenic cell collection includes one
or more cells that have a naturally-occurring allelic variant of the target locus, or that encode
a protein having a sequence identical to that encoded by a naturally-occurring allelic variant
of the target locus, correlation of phenotype with genotype allows the phenotypic effects of
such natural variants readily to be assessed in the context of a uniform genetic background.
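
As an illustration of the second case, in which each of the 20 natural amino acids is represented at a single residue, the following Python sketch enumerates the 19 substitutions needed alongside the reference residue. The function name, the one-letter variant labels, and the three-residue placeholder sequence are assumptions made for this example only.

```python
from typing import List

# One-letter codes for the 20 natural amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitutions_at(protein: str, position: int) -> List[str]:
    """List the 19 single-residue substitutions at a 1-based position,
    labeled as reference residue, position, replacement (e.g. 'K2A')."""
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the protein sequence")
    reference = protein[position - 1]
    if reference not in AMINO_ACIDS:
        raise ValueError("reference residue is not one of the 20 natural amino acids")
    return [f"{reference}{position}{aa}" for aa in AMINO_ACIDS if aa != reference]


# Placeholder sequence, not a real protein.
variants = substitutions_at("MKV", 2)
```

Together with the unmodified reference cell, the cells carrying these 19 substitutions would present all 20 amino acids at the chosen residue.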

In one series of embodiments, the method is used to identify genotypes
that alter the cellular responsiveness to xenobiotics, which will typically be known or
potential therapeutic agents.

In such embodiments, as well as in other embodiments of the methods of
the present invention, the target locus at which the cells of the collection are coisogenic can
usefully be selected from the group consisting of: CYP1A2, CYP2C17, CYP2D6, CYP2E,
CYP3A4, CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9,
CYP11A, CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3,
CYP6D1, CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19,
CYP21A2, CYP24, CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4,
ABCC5, ABCC6, MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2,
LTA4H, TRAG3, GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase
(UGT), sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, glutathione
S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2.

The method can usefully include a step, before assay, of contacting the
coisogenic cell collection with a xenobiotic, typically a known or potential therapeutic agent.
Potential therapeutic agents can be natural products or products of a combinatorial
chemical synthesis.

The method can also usefully include a later step, after the correlations
have been made, of collecting the correlations into at least one dataset; the dataset is often,
but not necessarily, recorded on a computer-readable medium. In such case, the dataset
can thereafter usefully be queried, e.g. to predict a cellular phenotype based upon the
genotype at the relevant target locus.

Thus, in another aspect, the invention provides a method of predicting a
phenotypic characteristic of a cell based upon its genotype at a target locus. The method
comprises using the cell's genotype at a chosen target locus, or a unique identifier thereof,
as a query to retrieve from a dataset data that report a phenotypic characteristic correlated
with the target locus genotype. The dataset that is queried in this method includes
correlations from at least five cells that are coisogenic at the target locus. The phenotypic
characteristic retrieved from query of the dataset provides a prediction of the cell's
phenotypic characteristic.
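
The query step just described can be pictured as a keyed lookup. The following Python sketch is a minimal illustration under stated assumptions: the dataset is modeled as an in-memory dictionary keyed by a (locus, variant identifier) pair, the function name is hypothetical, and the locus name, variant labels, and phenotype strings are placeholders rather than measured results. The requirement of at least five coisogenic cells is enforced as a precondition.

```python
from typing import Dict, Optional, Tuple

GenotypeKey = Tuple[str, str]  # (target locus, variant identifier)


def predict_phenotype(dataset: Dict[GenotypeKey, str], locus: str, variant: str,
                      min_cells: int = 5) -> Optional[str]:
    """Return the phenotype correlated with the queried genotype, or None if
    the genotype is absent. The dataset must hold correlations from at least
    `min_cells` coisogenic cells at the queried locus."""
    entries_at_locus = sum(1 for key in dataset if key[0] == locus)
    if entries_at_locus < min_cells:
        raise ValueError("dataset holds too few correlations for this locus")
    return dataset.get((locus, variant))


# Placeholder dataset of correlations for a hypothetical locus.
correlations = {
    ("LOCUS_X", "ref"): "baseline",
    ("LOCUS_X", "var1"): "increased drug sensitivity",
    ("LOCUS_X", "var2"): "baseline",
    ("LOCUS_X", "var3"): "reduced drug sensitivity",
    ("LOCUS_X", "var4"): "baseline",
}
prediction = predict_phenotype(correlations, "LOCUS_X", "var1")
```

A dictionary is used here only for brevity; a relational database or any other store that can be queried by a unique genotype identifier would serve the same purpose.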

The target locus "genotype" to be used as a query can be obtained by any
means known in the art, including sequencing of the genomic DNA of the target locus,
sequencing of the mRNA transcript from the target locus, sequencing of the protein
encoded at the target locus, or any of the known methods for identifying allelic variants at a
given locus, such as those set forth in U.S. Patent Nos. 5,952,174, 5,846,710, 5,710,028
and 5,679,524, and those reviewed in Kwok, "High-throughput genotyping assay
approaches," Pharmacogenomics 1(1):95-100 (2000), the disclosures of which are
incorporated herein by reference. In addition, apparatus is now available commercially that
permits the ready identification of allelic variants at a chosen target locus, such as the
SnIPer™ High Throughput SNP Scoring System (Amersham Pharmacia Biotech,
Piscataway, NJ, USA) and the SNPstream™ (Orchid Biosciences, Princeton, NJ, USA).

The cell for which the genotype is to be used as query can be a cultured
cell or, alternatively, can be a noncultured cell derived directly from a eukaryotic organism.
In the latter case, the genotype can be obtained, for example, from cells, such as circulating
blood cells, that are replenishable in vivo. The cell for which the genotype is determined
can be normally present in the eukaryotic organism or can be aberrant or otherwise
diseased.

Usefully, the target locus genotype can be obtained from cells of a human
being.

The query itself can include the entirety of the nucleic acid or protein
sequence of the target locus, a portion of the nucleic acid or protein sequence of the target
locus, or even a single nucleotide or protein identifier and base or residue number that can
serve as a unique identifier of the target locus genotype. Methods are well known in the
bioinformatic arts for querying databases having sequence-related information.

The dataset to be queried includes correlations derived from at least five
cells that are coisogenic at the target locus. Typically, the coisogenic cells will have been
drawn from a cell collection according to the present invention.

Where the cellular genotype used as query is derived from a human being,
the above-described methods provide a streamlined approach to pharmacogenomic
analysis.

An antecedent to traditional pharmacogenomic studies is the identification
of a large number of naturally-occurring allelic variants, and correlation of the
naturally-occurring alleles with naturally-occurring clinical phenotypes. Only then can a patient's
genotype be used to predict the patient's probable clinical phenotype.

In contrast, the coisogenic collections of eukaryotic cells of the present
invention allow all possible alleles readily to be constructed, and the resulting cellular
phenotypes to be correlated with target locus genotype. Where the cellular phenotype can
be correlated with the phenotype of the entire organism, as can readily be done with loci that
affect responsiveness to xenobiotics, the dataset of correlated phenotypes can provide
reliable phenotypic predictions, even for alleles that had not previously been identified within
the natural population.

Thus, in certain particularly useful embodiments, the query genotype is from a human cell, and the target locus is selected from the group consisting of CYP1A2, CYP2C17, CYP2D6, CYP2E, CYP3A4, CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9, CYP11A, CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3, CYP6D1, CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, CYP21A2, CYP24, CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4, ABCC5, ABCC6, MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2, LTA4H, TRAG3, GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase (UGT), sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, glutathione S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2, and the cellular phenotypic characteristic can usefully be cellular responsiveness to a xenobiotic; in such case, the prediction can be a prediction of an individual's potential responsiveness to that xenobiotic agent.

The following examples are offered for purpose of illustration and not by way of limitation.

EXAMPLE 1
Coisogenic Eukaryotic Cell Collections
Having Natural Allelic Variants of ABCB1 (MDR1)

Targeting oligos are used to create a cell collection coisogenic at the human ABCB1 (MDR1) locus. The targeting oligonucleotides include terminal modifications as set forth above, including at least one phosphorothioate linkage, and are introduced in parallel into separate aliquots of HBL100 cells using standard techniques. Potential cellular transformants are propagated in vitro, cloned, and clonal cell lines having the desired targeted change identified by sequencing DNA amplified from the ABCB1 locus.

The targeting oligos have sequences (presented in Table 35, below) designed to create natural allelic variants of the ABCB1 gene, creating a legacy-free, perfectly coisogenic cell collection in which the naturally occurring alleles of ABCB1 are presented on the identical genetic background of a human breast epithelial cell line.

The left-most column of the table identifies the alteration that converts the wild type to the variant allele, at both the amino acid and the nucleotide level. At the amino acid level, mutations are presented according to the following standard nomenclature. The centered number identifies the position of the mutated codon in the protein sequence; to the left of the number is the wild type residue and to the right of the number is the mutant codon. At the nucleic acid level, the entire triplet of the wild type and mutated codons is shown.
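
The two-level notation lends itself to a mechanical consistency check. The following Python sketch, offered only as an illustration, parses an entry such as "Asn21Asp" with its codon pair "AAT-GAT" and confirms against the standard genetic code that each codon encodes the stated residue. The function name `parse_variation` and its return layout are assumptions made for this example.

```python
import re

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_variation(residue_label, codon_label):
    """Split e.g. ("Asn21Asp", "AAT-GAT") and check codons against residues."""
    m = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", residue_label)
    if m is None:
        raise ValueError(f"unrecognized residue label: {residue_label!r}")
    wt_res, position, mut_res = m.group(1), int(m.group(2)), m.group(3)
    wt_codon, mut_codon = codon_label.upper().split("-")
    consistent = (CODON_TABLE[wt_codon] == THREE_TO_ONE[wt_res]
                  and CODON_TABLE[mut_codon] == THREE_TO_ONE[mut_res])
    return position, wt_res, mut_res, wt_codon, mut_codon, consistent


print(parse_variation("Asn21Asp", "AAT-GAT"))
```

A label whose codons do not encode the named residues comes back with the final field set to False, which is a convenient flag for transcription errors in a table of this kind.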

The middle column presents, for each alteration (mutation), four oligonucleotides capable of changing the wild type sequence site-specifically to the identified allelic variant.

All oligonucleotides are presented, per convention, in the 5' to 3' orientation. The nucleotide that effects the change in the genome is underlined and presented in bold.

The first of the four oligonucleotides for each mutation is a 121 nt oligonucleotide centered about the altering ("repair") nucleotide. The second oligonucleotide, its reverse complement, targets the opposite strand of the DNA duplex for change ("repair"). The third oligonucleotide is the minimal 17 nt domain of the first oligonucleotide, also centered about the repair nucleotide. The fourth oligonucleotide is the reverse complement of the third, and thus represents the minimal 17 nt domain of the second.

The third column of the table presents the SEQ ID NO: of the respective targeting oligonucleotide.
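
The layout of the first oligonucleotide (121 nt, centered on the altering nucleotide) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to generate the table: `design_oligo` is a hypothetical helper, and `REFERENCE` is a toy repeat sequence standing in for wild type genomic sequence around the target codon.

```python
def design_oligo(reference, index, new_base, length=121):
    """Return a `length`-nt oligo centered on reference[index] carrying new_base."""
    half = length // 2
    if index - half < 0 or index + half >= len(reference):
        raise ValueError("reference too short to center an oligo here")
    window = reference[index - half:index + half + 1]
    if window[half] == new_base:
        raise ValueError("new base equals the wild type base")
    # Keep both flanks unchanged and swap only the central nucleotide.
    return window[:half] + new_base + window[half + 1:]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


REFERENCE = "ACGT" * 50   # toy 200 nt stand-in, not a real locus
INDEX = 100               # 0-based position of the base to alter ("A" here)

oligo = design_oligo(REFERENCE, INDEX, "G")
wild_type_window = REFERENCE[INDEX - 60:INDEX + 61]
print(len(oligo), oligo[60], mismatches(oligo, wild_type_window))
```

The oligo differs from the wild type window at exactly one position, the central one, which is the property the 121 nt and 17 nt designs share.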



Table 35 

ABCB1 (MDR1) Targeting Oligos to Create Natural Alleles 


Allelic Variation 


Sequence of 
Targeting Oligos 


SEQ ID NO: 


Asn21Asp 
AAT-GAT 


AIGGATCHGAAGGGGA 

GCGCAATGGAGGAGCAA 

AGAAGAAGAACIIIIIIA 

AACTGAAC6ATAAAAGG 

TAACTAGCTTGTrTCATT 

nCATAGTTTAGATAGn 

GCGAGATTTGAGTAAT 


1 




ATrACTCAAATCTOGCAA 

CTATGTAAACTATGAAAA 

TGAAAGAAGCTAGTTACC 

1 1 1 lATCGnCAGTHAA 

AAAAGnCTTCTTCTnG 

CTCGTCCATTGCGGTCC 

CCnCAAGATCCAT 


2 




AACTGAACGATAAAAGG 


3 




CCTTTTATCGTTCAGn 


4 


Phe103Ser 

TTC-TCC


AAGAGACATAAATGGTAT 

GTTrGTTTTGTGGTGGTC 

TAGGTGATATCAATGATA 

CAGGGTCCTTCATGAAT 

CTGGAGGAAGACATGAC 

CAGGTAAnAGAGATTCT 

CCnACTATTGnAA 


5 




TTAACAATAGTAAGGAGA 

ATGTCTAATTACCTGGTC 

ATGTCTTCCTCCAGATrC 

ATGAAGGACCCTGTATC 

AnGATATCACCTAGACO 

ACCACAAAACAAACATAG 

CATTTATGTCTCTT 


6 




TACAGGGTCCnCATGA 


7 




TCATGAAGGACCCTGTA 


8 


Phe103Leu 
TTC-CTC 


AAAGAGACATAAATGGTA 
TGTTTGTTTTGTGGTGGT 
u 1 Abb 1 bA 1 A 1 L»AA 1 bA 1 

ACAGGGCTCnCATGAA 
TCTGGAGGAAGACATGA 
COAGGTAATTAGACATTC 
TCCTTACTATTGnA 


9 




TAACAATAGTAAGGAGAA 

TGTCTAAnACCTGGTGA 

TGTCTTCCTCCAGATTCA 

TGAAGAGCCCTGTATCA 

TTGATATCACCTAGACCA 

CCACAAAACAAACATAGC 

AnTATGTCTCTTT 


10 

1 




ATACAGGGCTCnCATG 


11 




CATGAAGAgCCCTGTAT 


12 


Gly185Val
GGA-GTA 


TTCTGACAATTAmCTA 

ACACTATCTGTTGTnCA 

GTGATGTCTCCAAGATTA 

ATGAAGTAATTGGTGAGA 

AAATTGGAATGnCTTTC 

AGTCAATGGCAACATITT 

TCACTGGGTTTAT 


13 




ATAAACCCAGTGAAAAAT 

GnGCCATTGACTGAAA 

GAACATTCCAATTTTGTC 

ACCAATTACTTCATTAAT 

CnGGAGACATCACTGA 

AAGAACAGATAGIGHA 

GAAATAATTGTCAGAA 


14 




TAATGAAGIAATTGGTG 


15 




CACCAATTACTTCAnA 


16 


Ser400Asn 
AGT-AAT 


AGAGTGGGCACAAACCA 

RATAATATTAAfifiRAAAT 

TTGGAATTCAGAAATGn 

CACTTCAATTACCCATCT 

CGAAAAGAAGHAAGGT 

AGAGTGATAAATGATTAA 

TCAACAATTAATCTA 


17 




TAGATTAATTGTTGAnA 
ATCAnTATCACTGTACC 
TTAACTTCrrnCGAGAT 
GGGTAATTGAAGTGAAC 
ATITCTGAATTCCAAAn 
TCGCnAATATTATCTGG 
TITGTGCCCACTCT 


18 




TCACnCAATTACCCAT 


19 




ATGGGTAAITGAAGTGA 


20 


Val801Met 
GTG-ATG 


GGAGCTGAGAGTCTGAT 

AAACAGCinAAGGTAAT 

AAAATCAnTTCTGTGCC 

ACAGGATATGAGTTGGT 

TTGATGACCCTAAAAACA 

CCACTGGAGCATTGACT 

AGCAGGCTCGCCAATG 


21 




CATTGGCGAGCCTGGTA 

GTCAATGCTCCAGTGGT 

GIIIIIA6GGTCATCAAA 

CCAACTCATATCCTGTG 

GCACAGAAAATGATTrTA 

TTACCTTAAAGCTGnTA 

TGAGACTCTCAGCTCC 


22 




CACAGGATATGAGnGG 


23 




CCAACTCAIATCCTGTG 


24 


Ile829Val
ATA-GTA


AGCATGAGnGTGAAGA 

TAATATnTTAAAATTTCT 

ulAAl 1 lol 1 1 ifaTTTTG 

CAGGCTGTAGGTTCCAG 

GCnGCTGTAATTACCCA 

GAATATAGCAAATCnGG 

GACAGGAATAATTA 


25 




TAAnATTCCTGTGCCAA 

GATFTGCTATATTCTGGG 

TAATTACAGCAAGCCTG 

GAACGTACAGCGTGCAA 

AACAAAACAAATTAGAGA 

AATTTTAAAAATAnATCT 

TCACAACTCATGCT 


26 




TGCAGGCTGTAGGnCC 


27 




GGAAGGTAeAGCCTGCA 


28 


Ser893Ala
TCT-GCT 


GTTGTTGAAATGAAAATG 

nGTCTGGACAAGCACT 

GAAAGATAAGAAAGAAC 

TAGAAGGTGGTGGGAAG 

GTGAGTCAAACTAAATAT 

GATTGATTAAnAAGTAG 

AGTAAAGTATTCTAAT 


29 




AnAGAATACTTTACTCT 

ACTTAATTAATCAATCAT 

ATTTAGinGACTCACCT 

TCCCAGCACCrrCTAGTT 

CTiTCnATCTTTCAGTG 

CnGTCCAGACAACATH 

TCATTTCAACAAC 


30 




TAGAAGGTGCTGGGAAG 


31 




CnCCCAGCACCnOTA 


32 


Ser893Thr 
TCT-ACT 


GnGHGAAATGAAAATG 

TTGTCTGGACAAGCACT 

GAAAGATAAGAAAGAAC 

TAGAAGGTACTGGGAAG 

GTGAGTCAAACTAAATAT 

GAnGATTAATTAAGTAG 

AGTAAAGTATTCTAAT 


33 




AnAGAATAClTTACTCT 

ACTTAAHAATCAATCAT 

ATTTAGTTTGACTCACCT 

TCCCAGTACCTTCTAGTT 

CTTTCTTATCTITCAGTG 

CnGTCCAGACAACATTT 

TCATITCAACAAC 


34 




TAGAAGGTACTGGGAAG 


35 




CTTCCCAGIACCTTCTA 


36 


Ala999Thr 
GCC-ACC 


TCAGCTGnGTCTITGGT 

GCCATGGCCGTGGGGC 

AAGTCAGnCAHTGCTC 

CTGACTATACCAAAGCC 

AAAATATCAGCAGCCCA 

CATCATCATGATCATTGA 

AAAAACCCCTITGAnG 


37 




CAATCAAAGGGGIIIIII 

CAATGATCATGATGATGT 

GGGCTGCTGATATTTTG 

GCTTTGGTATAGTCAGG 

AGCAAATGAACTGACn 

GCCGGACGGCCATGGCA 

CCAAAGACAACAGCTGA 


38 




CTGACTATACCAAAGCC 


39 




GGCnTGGTATAGTCAG 


40 


Gln1107Pro 
CAG-CCG 


GATCTGTGAACTCrrGTT 
TTGAGCTGGTTGATGGC 
AAAGAAATAAAGCGACT 
GAATGTrCCGTGGCTCC 

GTGTCCCAGGAGCCCAT 
CCTGITTGACTGCAGCA 
T 


41 




ATGCTGCAGTCAAACAG 

GATGGGCTCCTGGGACA 

CGATGCCCAGGTGTGCT 

CGGAGCCACGGAACATT 

CAGTCGCTTTAITTGnT 

GCCATCAAGCAGCTGAA 

AACAAGAGTTCACAGAT 

C 


42 




GAATGnGCGTGGCTCC 


43 




GGAGCCACGGAACAnC 


44 



Aliquots of the coisogenic cell collection are thereafter separately contacted with a variety of chemotherapeutic agents presently used for, or contemplated for use in, treatment of breast adenocarcinoma, and alleles that increase or decrease sensitivity to the cytotoxic effects of the agents are identified.
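
One simple way to score such a screen is to compare each allele's response with that of the wild type line. The Python sketch below assumes, purely for illustration, that sensitivity is summarized as an IC50 (a higher IC50 meaning lower sensitivity) and that a two-fold change is the calling threshold; the numeric values are invented placeholders, not experimental results.

```python
def classify_alleles(ic50_by_allele, wild_type="wild type", fold=2.0):
    """Label each allele by its IC50 relative to the wild type line."""
    baseline = ic50_by_allele[wild_type]
    calls = {}
    for allele, ic50 in ic50_by_allele.items():
        if allele == wild_type:
            continue
        ratio = ic50 / baseline
        if ratio >= fold:
            calls[allele] = "decreased sensitivity"   # more drug needed
        elif ratio <= 1.0 / fold:
            calls[allele] = "increased sensitivity"   # less drug needed
        else:
            calls[allele] = "unchanged"
    return calls


# Hypothetical IC50 values (arbitrary units) for one agent.
HYPOTHETICAL_IC50 = {
    "wild type": 10.0,
    "Asn21Asp": 40.0,
    "Phe103Ser": 2.0,
    "Gly185Val": 12.0,
}

print(classify_alleles(HYPOTHETICAL_IC50))
```

Because every line in the collection shares the same genetic background, a difference from the wild type value can be attributed to the allele at the target locus rather than to unrelated variation.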

EXAMPLE 2

Coisogenic Eukaryotic Cell Collections
Having Natural Allelic Variants of CYP2D6

Targeting oligos are used to create a cell collection coisogenic at the human CYP2D6 locus.

The targeting oligonucleotides include terminal modifications as set forth above, including at least one phosphorothioate linkage, and are introduced in parallel into separate aliquots of HBL100 cells using standard techniques. Potential cellular transformants are propagated in vitro, cloned, and clonal cell lines having the desired targeted change identified by sequencing DNA amplified from the CYP2D6 locus.

The targeting oligos have sequences (presented in Table 36, below) designed to create natural allelic variants of the CYP2D6 gene, creating a legacy-free, perfectly coisogenic cell collection in which the naturally occurring alleles of CYP2D6 are presented on the identical genetic background of a human breast epithelial cell line.

The left-most column of the table identifies the alteration that converts the wild type to the variant allele, at both the amino acid and the nucleotide level. At the amino acid level, mutations are presented according to the following standard nomenclature. The centered number identifies the position of the mutated codon in the protein sequence; to the left of the number is the wild type residue and to the right of the number is the mutant codon. At the nucleic acid level, the entire triplet of the wild type and mutated codons is shown.

The middle column presents, for each alteration (mutation), four oligonucleotides capable of changing the wild type sequence site-specifically to the identified allelic variant.

All oligonucleotides are presented, per convention, in the 5' to 3' orientation. The nucleotide that effects the change in the genome is underlined and presented in bold.

The first of the four oligonucleotides for each mutation is a 121 nt oligonucleotide centered about the altering ("repair") nucleotide. The second oligonucleotide, its reverse complement, targets the opposite strand of the DNA duplex for change ("repair"). The third oligonucleotide is the minimal 17 nt domain of the first oligonucleotide, also centered about the repair nucleotide. The fourth oligonucleotide is the reverse complement of the third, and thus represents the minimal 17 nt domain of the second.

The third column of the table presents the SEQ ID NO: of the respective targeting oligonucleotide.
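
The relationship among the four oligonucleotides of each set can be checked mechanically. The Python sketch below takes the 121 nt sequence listed as SEQ ID NO: 45 (Val7Met) in Table 36, as transcribed here, and derives its reverse complement and its central 17 nt domain. The helper names `reverse_complement` and `core` are assumptions made for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def core(seq, length=17):
    """Return the central `length` nucleotides of an odd-length sequence."""
    mid = len(seq) // 2
    half = length // 2
    return seq[mid - half:mid + half + 1]


# First oligonucleotide of the Val7Met set, as listed in Table 36.
SEQ_45 = (
    "GCCAGGTGTGTCCAGAGGAGCCCATTTGGTAGT"
    "GAGGCAGGTATGGGGCTAGAAGCACTGATGCCC"
    "CTGGCCGTGATAGTGGCCATCTTCCTGCTCCTGG"
    "TGGACCTGATGCACCGGCGCC"
)

second = reverse_complement(SEQ_45)   # targets the opposite strand
third = core(SEQ_45)                  # minimal 17 nt domain of the first
fourth = reverse_complement(third)    # minimal 17 nt domain of the second

print(len(SEQ_45), third, fourth, third[8])
```

The derived third oligonucleotide agrees with the 17 nt sequence listed as SEQ ID NO: 47, and its central nucleotide is the A that converts the GTG codon to ATG.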



Table 36 

CYP2D6 Targeting Oligos to Create Natural Alleles 



Allelic 
Variation 


Sequence of Targeting Oligos 


SEQ ID NO: 


Val7Met 
GTG-ATG 


GCCAGGTGTGTCCAGAGGAGCCCATTTGGTAGT 
GAGGCAGGTATGGGGCTAGAAGCACTGATGCCC 
CTGGCCGTGATAGTGGCCATCTTCCTGCTCCTGG 
TGGACCTGATGCACCGGCGCC 


45 




GGCGCCGGTGCATCAGGTCCACCAGGAGCAGGA 
AbATbGCCACTATCACGGCCAGGGGCATCAGTG 
CTTCTAGCCCCATACCTGCCTCACTACCAAATGG 
GCTCCTCTGGACACACCTGGC 


46 




AAGCACTGATGCCCCTG 


47 




CAGGGGCAICAGTGCTT 


48 


Val11Met
GTG-ATG 


CAGAGGAGCCCATTTGGTAGTGAGGCAGGTATG 
GGGCTAGAAGCACTGGTGCCCCTGGCCATGATA 
GTGGCCATCnCGTGCTCCTGGTGGACCTGATGC 
ACCGGCGCCAACGCTGGGCTG 


49 




GAGCCCAGCGnGGCGCCGGTGCATCAGGTCCA 
CCAGGAGCAGGAAGATGGCCACTATCATGGCCA 
GGGGCACCAGTGCnCTAGCCCCATACCTGCCTC 
AGTACCAAATGGGGTCCTCTG 


50 




CCCTGGCCATGATAGTG 


51 




CAGTATCAIGGCCAGGG 


52 


Arg26His 
CGC-CAC 


TGGTGCCCCTGGCCGTGATAGTGGCCATCTTCCT 
GCTCCTGGTGGACCTGATGCACGGGCACCAACG 
CTGGGCTGCACGCTACCCACCAGGCCCCCTGCC 
ACTGCCCGGGCTGGGCAACCT 


53 




AGGTTGCCCAGCCCGGGCAGTGGCAGGGGGCC 
TGGTGGGTAGGGTGCAGCCCAGCGTTGGTGCCG 
GTGCATCAGGTCCACCAGGAGCAGGAAGATGGC 
CACTATCACGGCCAGGGGCACCA 


54 




GCACCGGCACCAACGCT 


55




AGCGTTGGTGCCGGTGC


56


Arg28Cys 
CGC-TGC 


CCGGTGGCCGTGATAGTGGGCATCTTGCTGCTCG 
TGGTGGACCTGATGCACCGGCGCCAAIGCTGGG 
GTGGAGGCTACCCACCAGGCCCCCTGCCACTGC 
GCGGGCTGGGCAACCTGCTGC 


57 




GCAGCAGGTTGCCCAGCCCGGGCAGTGGCAGG 
GGGCCTGGTGGGTAGCGTGCAGGCCAGCAnGG 
CGCCGGTGCATCAGGTCCACCAGGAGCAGGAAG 
ATGGCCACTATCACGGCCAGGGG 


58 




GGCGCCAATGCTGGGCT 


59 




AGCCCAGCATTGGCGCC


60


Pro34Ser 
CCA-TCA 


GGCATCTTCCTGCTCCTGGTGGACCTGATGCACC 

GGCGCCAACGCTGGGCTGCACGCTACTCACGAG 
GCCCCCTGCCACTGCGCGGGCTGGGCAACCTGG 
TGCATGTGGACTTGCAGAACA 


61 




TGTTCTGGAAGTCCACATGCAGCAGGTTGCCCAG 
CCCGGGCAGTG6CAGGGGGCCTGGTGAGTAGC 
GTGCAGCCCAGCGnGGCGCCGGTGCATCAGGT 
CCAGCAGGAGCAGGAAGATGGC 


62 




CACGCTACICACCAGGC 


63 




GCCTGGTGAGTAGCGTG 


64 


Gly42Arg 
GGG-AGG 


CTGATGCACCGGCGCCAACGCTGGGCTGCACGC 
TACCCACCAGGCCCCCTGCGACTGGGCAGGGTG 
GGCAAGCTGCTGCATGTGGACnCCAGAACACAC 
CATACTGCnCGACCAGGTGA 


65 




TCACCTGGTCGAAGGAGTATGGTGTGTTCTGGAA 
GTCCACATGGAGCAGGTTGCCCAGCCIGGGCAG 
TGGCAGGGGGCCTGGTGGGTAGCGTGCAGCCCA 
GCGnGGCGCCGGTGCATCAG 


66 




CACTGCCCAGGCTGGGC 


67 




GCCCAGCCIGGGCAGTG 


68 


Ala85Val
GCG-GTG


TCGGGGACGTGTrCAGCCTGCAGCTGGCCTGGA 
CGCCGGTGGTCGTGCTCAATGGGCTGGIGGCCG 
TGCGCGAGGCGCTGGTGAGCGACGGCGAGGACA 
CCGCCGACGGCCCGCCTGTGCC 


69 




GGCACAGGCGGGCGGTCGGGGGTGTCCTCGCC 
GTGGGTCACCAGCGCCTCGCGCACGGCCACCAG 
CCCATrGAGCACGACCACCGGCGTCCAGGCCAG 
CTGCAGGCTGAACAGGTCCCCGA 


70 




TGGGCTGGIGGCCGTGC 


71 




GCACGGCCACCAGCCCA 


72 


Leu91Met
CTG-ATG 


CTGCAGCTGGCCTGGACGGCGGTGGTCGTGCTC 
AATGGGCTGGGGGCCGTGCGCGAGGCGATGGT 
GACCCACGGCGAGGACACCGGCGAGGGCCCGG 
GTGTGGCCATGACCGAGATCCTGG 


73 




CGAGGATCTGGGTGATGGGCACAGGGGGGCGGT 
CGGCGGTGTGCTCGGCGTGGGTCACGATCGGCT 
CGCGCACGGCGGCCAGCCGATrGAGGAGGACCA 
CCGGCGTCCAGGCCAGCTGCAG 


74 




GCGAGGCGATGGTGACC 


75 




GGTCACCAIGGCCTCGC 


76 


His94Arg 
CAC-CGC 


GCTGGACGGCGGTGGTCGTGCTCAATGGGGTGG 
CGGCCGTGCGCGAGGCGCTGGTGAGCCGCGGC 
GAGGACACGGCCGAGCGCCCGCCTGTGGCCATC 
ACCCAGATCCTGGGTrTCGGGCC 


77 




GGCCCGAAACCCAGGATCTGGGTGATGGGGACA 
GGCGGGCGGTCGGCGGTGTCGTCGCCGCGGGT 
CACCAGCGCCTCGGGCACGGCCGGGAGCCCATT 
GAGCACGACCACCGGCGTCCAGG 


78 




GGTGACCCGCGGCGAGG


79




CCTCGCCGCGGGTCACC


80


Thr107Ile
ACC-ATC 


TGCGCGAGGCGCTGGTGACCCAGGGCGAGGACA 
CCGCCGACCGCCCGCCTGTGGCCATCAICCAGA 
TCCTGGGTrrCGGGCCGCGTTCCCAAGGCAAGC 
AGCGGTGGGGACAGAGACAGAT 


81 




ATCTGTCTCTGTCCCCACCGCTGCTTGCCTTGGG 
AACGCGGCCCGAAACCCAGGATCTGGATGATGG 
GCACAGGCGGGCGGTCGGCGGTGTCCTCGCCG 
TGGGTCACCAGCGCCTCGCGCA 


82 




GCCCATCATCCAGATCC 


83 




fir^ATCTGGATGATGGGC 


84 


Val136Met 
GTG-ATG


CCCCCAGGGGTGnCCTGGCGCGCTATGGGCCC 
GCGTGGCGCGAGCAGAGGCGCnCTCCATGTCC 
AGCTTGCGCAACnGGGCCTGGGCAAGAAGTCG 
CTGGAGCAGTGGGTGACCGAGG 


85 




CCTCGGTCAGCCACTGCTCCAGCGACnCTTGCC 
CAGGCCCAAGTrGCGCAAGGTGGACAIGGAGAA 
GCGCCTCTGCTCGCGCCACGCGGGCCCATAGCG 
CGGCAGGAACACCCCTGGGGG 


86 




GCnCTCCATGTCCACC 


87 




GGTGGACAIGGAGAAGG 


88 


Gln151Glu 

CAG-GAG 


CAGAGGCGCnCTCCGTGTCCACCTTGCGCAACT 
TGGGCCTGGGCAAGAAGTCGCTGGAGGAGTGGG 
TGACCGAGGAGGCCGCGTGCCnTGTGCCGCCT 
TGGCGAAGCACTCCGGTGGGT 


89 




ACCCACCGGAGTGGTTGGCGAAGGCGGCACAAA 
GGCAGGCGGCCTCCTCGGTGACCCACTCCTCCA 
GCGACTTCTTGCCCAGGCCCAAGTTGCGCAAGGT 
GGAGACGGAGAAGCGCCTCTG 


90 




CGCTGGAG^GTGGGTG 


91 




CACCCACTCCTCCAGCG 


92 


Asn166Asp 
AAC-GAC 


AAGAAGTCGCT6GAGCAGTGGGTGACCGAG6AG 
GCCGCCTGCCTTTGTGCCGCCTTCGCCGACCACT 
CCGGTGGGTGATGGGCAGAAGGGGACAAAGCGG 
GAACTGGGAAGGCGGGGGACG 


93 




CGTCCCCCGCGTTCCCAGTTCCCGCTnGTGCCC 
nCTGCCCATCACGGACCGGAGTGGTCGGCGAA 
GGGGGCAGAAAGGCAGGCGGGCTCGTCGGTGAG 
CCACTGCTCGAGGGACTTCTT 


94 




CCTTCGCGGACCAGTCG 


95 




GGAGTGGTCGGCGAAGG 


96 


Gly169Arg 
GGA-AGA 


CTGGAGCAGTGGGTGACCGAGGAGGCGGCCTGC 
CTTTGTGCCGCGTTCGGCAACCACTCCAGTGGGT 
GATGGGCAGAAGGGGACAAAGGGGGAACTGGGA 
AGGCGGGGGACGGGGAAGGGG 


97 




CGGGnCCGGGTCGGCGGCCTTCGCAGTTGGCGG 
TTTGTGCCCnCTGCCCATCACCCACTGGAGTGG 
rrGGCGAAGGCGGCAGAAAGGCAGGCGGGGTCC 
TCGGTCACGCACTGCTCCAG 


98 




ACCACTCCAGTGGGTGA 


99 




TCACCGACIGGAGTG6T 


100 


Arg173Cys 
CGC-TGC 


AGGCGGGGGACGGGGAAGGCGACCCCnACCC 
GCATCTCCCACCCCCAGGACGGCCCTTTTGCCCC 
AACGGTCTGnGGACAAAGCCGTGAGCAAGGTGA 
TCGCCTCCCTCACGTGGGGGC 


101 




GCCCGCAGGTGAGGGAGGCGATCACGTTGCTCA 
CGGCnTGTCCAAGAGACCGTTGGGGCAAAAGG 
GGCGTCCTGGGGGTGGGAGATGCGGGTAAGGG 
GTCGCCrrCCCCGTCCCCCGCCT 


102 




GCCCCTTTTGCCCCAAC 


103 




GTTGGGGCAAAAGGGGC 


104 


Arg201His 
CGC-CAC 


GCGTGAGCAACGTGATCGCCTCCCTGACCTGCG 
GGCGCCGGTTCGAGTACGACGAGCCTCACTTCCT 
CAGGCTGCTGGAGCTAGCTCAGGAGGGACTGAA 
GGAGGAGTCGGGCTTTCIGGG 


105 




GGCAGAAAGCCCGACTGCTCCTTCAGTCCCTCCT 
GAGCTAGGTCCAGCAGCCTGAGGAAGIGAGGGT 
CGTCGTACTCGAAGCGGCGCCCGGAGGTGAGGG 
AGGCGATCACGTrGCTCACGG 


106 




CGACCCTGACnCCTCA 


107 




TGAGGAAGTGAGGGTCG 


108 


Gly212Glu 
GGA-GAA 


GGCGCCGCTTCGAGTACGACGACCCTCGCTTCC 
TCAGGCTGCTGGACCTAGCTCAGGAGGAACTGA 
AGGAGGAGTGGGGCTTTCTGCGGGAGGTGCGGA 
GCGAGAGACCGAGGAGTCTCTG 


109 




CAGAGAGTGGTCGGTCTCTCGCTGCGCACCTGGC 
GCAGAAAGCCCGACTCGTCGnCAGTICCTCCTG 
AGCTAGGTCCAGCAGCCTGAGGAAGCGAGGGTC 
GTCGTAGTCGAAGCGGCGCC 


110 




TCAGGAGGAACTGAAGG 


111 




GCrrCAGTICCTCGTGA 


112 


Leu231Pro 
CTG-CCG 


GAGGAGGGATTGAGACCCGGTTCTGTCTGGTGTA 
GGTGGTGAATGGTGTCGCGGTCGTCGCGCATATC 
CGAGCGGTGGCTGGCAAGGTCCTACGGTTCCAAA 
AGGGirrGGIGACCGAGGT 


113 




AGCTGGGTCAGGAAAGCCnTTGGAAGCGTAGG 
ACCnGCCAGCCAGCGCTGGGATATGCGGGAGG 
ACGGGGACAGCATTCAGCACCTACACCAGACAGA 
ACGGGGTCTCAATCCGTCCTG 


114 




CGTCCTCCCGCATATCC 


115 




GGATATGC6GGAGGACG 


116 


Ala237Ser 
GCT-TCT 


CCGTTCTGTCTGGTGTAGGTGCTGAATGCTGTCC 
CCGTCCTCCTGCATATCCCAGCGCTGTCTGGCAA 
GGTCCTACGCTTCCAAAAGGCTrrCCTGACCCAG 
CTGGATGAGCTGCTAACTG 


117 




CAGTTAGCAGCTCATCCAGCTGGGTCAGGAAAGC 
CmTGGAAGCGTAGGACCTTGCCAGACAGCGCT 
GGGATATGCAGGAGGACGGGGACAGCATTGAGC 
ACCTACACCAGACAGAACGG 


118 




CAGCGCTGICTGGCAAG 


119 




CTTGCCAGACAGCGCTG 


120 


Arg296Cys 
CGC-TGC 


GCTCTCGGCCCTGCTCAGGCCAAGGGGAACCCT 
GAGAGCAGCTTCAATGATGAGAACCTGTGCATAG 
TGGTGGCTGACCTGTTCTCTGCGGGGATGGTGA 
CCACCTCGACCACGCTGGGGT 


121 




AGGCCAGCGTGGTCGAGGTGGTCACCATCCCGG 
CAGAGAACAGGTCAGCCACCACTATGCACAGGTT 
CTCATCATrGAAGCTGCTCTGAGGGTTCCCCTTG 
GCCTGAGCAGGGCCGAGAGC 


122 




AGAACCTGIGGATAGTG 


123 




CAGTATGCACAGGTTGT 


124 


Ile297Leu
ATA-CTA 


GTCGGCCCTGCTCAGGCCAAGGGGAACCCTGAG 
AGCAGCTTCAATGATGAGAACCTGCGCCTAGTGG 
TGGCTGACCTGTTCTCTGCCGGGATGGTGACCAC 
CTCGACCACGCTGGCCTGGG 


125 




CCCAGGCCAGCGTGGTCGAGGTGGTCACCATGC 
CGGCAGAGAACAGGTCAGCCACCACTAGGCGCA 
GGTTCTGATCATTGAAGCTGCTGTCAGGGnCCC 
CTTGGCGTGAGCAGGGCCGAG 


126 




ACCTGCGGCTAGTGGTG 


127 




CACCACTAGGCGGAGGT 


128 


Ala300Gly 
GCT-GGT


CTCAGGGGAAGGGGAACCGTGAGAGCAGCnCA 
ATGATGAGAACCTGCGCATAGTGGTGG6TGACCT 
GTTCTCTGCCGGGATGGTGACCACCTCGACCAC 
GCTGGCCT6GGGCCTCCTGCT 


129 




AGCAGGAG6CCCCAGGCCA6CGTGGTCGAGGT 
GGTCACCATCCCGGCAGAGAACAGGTCACCCAC 
CACTATGCGCAGGTTCTCATCAnGAAGCTGCTC 
TCAGGGTTCCCCTTGGCCTGAG 


130 




AGTGGTGG6TGACCTGT 


131 




ACAGGTCACCCACCACT 


132 


Asp301Asn 
GAC-AAC 


CAGGCCAAGGGGAACCCTGAGAGCAGCTTCAAT 
GATGAGAACCTGCGCATAGTGGTGGCTAACCTGT 
TCTCTGCCGGGATGGTGACCACCTCGACCACGCT 
GGCCTGGGGCCTCCTGCTCA 


133 




TGAGCAGGAGGCGCCAGGCCAGCGTGGTCGAG 
GTGGTCAGCATCCCGGCAGAGAACAGGTTAGCC 
ACCACTATGCGCAGGnCTGATCATTGAAGCTGG 
TCTCAGGGTTCCCCnGGCCTG 


134 




TGGTGGGTAACCTGTTC 


135 




GAACAGGTIAGCCACCA 


136 


Ser311Leu 
TCG-TTG


ATGATGAGAACCTGCGCATAGTGGTGGCTGACCT 
GnCTCTGGCGGGATGGTGACGAGCTTGAGCAGG 
GTGGCCTGGGGCGTCCTGCTCATGATCCTACATC 
CGGATGTGCAGCGTGAGCC 


137 




GGCTGACGGTGCAGATCCGGATGTAGGATCATGA 
GCAGGAGGCCCCAGGGCAGCGTGGTCAAGGTG 
GTCACCATCCCGGCAGAGAACAGGTCAGGCACC 
ACTATGCGCAGGTTCTGATCAT 


138 




GACCACCTIGACCACGC 


139 




GCGTGGTCAAGGTGGTG 


140 


His324Pro
CAT-CCT 


CTGCCGGGATGGTGACCACCTCGACCACGCTGG 
CCTGGGGCCTGCTGCTCATGATCCTACCTCCGGA 
TGTGCAGCGTGAGCCCATCTGGGAAACAGTGCA 
GGGGGCGAGGGAGGAAGGGTA 


141 




TAGCGTrCCTCGCTCGGCCCCTGCAGTGTiTCCC 
AGATGGGCTCACGGTGGACATCCGGAGGTAGGA 
TCATGAGCAGGAGGCCCCAGGCCAGCGTGGTCG 
AGGTGGTCACCATCCCGGCAG 


142 




GATCCTACCTCCGGATG 


143 




CATCCGGAGGTAGGATG 


144 


Pro325Leu 
CCG-CTG


CCGGGATGGTGACCACCTGGAGCACGCTGGGCT 
GGGGCCTCCTGCTCATGATCCTACATCTGGATGT 
GGAGCGTGAGCCCATCTGGGAAACAGTGGAGGG 
GCCGAGGGAGGAAGGGTACAG 


145 




CTGTACCCTTGCTCCCTCGGCCCCTGCACTGTTT 
CCCAGATGGGCTCACGCTGCACATCCAGATGTAG 
GATCATGAGCAGGAGGCCCCAGGCCAGCGTGGT 
CGAGGTGGTCACCATCCCGG 


146 




CCTACATQIGGATGTGC 


147 




GCACATCCAGATGTAGG 


148 


Val338Met
GTG-ATG 


TGCTGACCCATTGTGGGGACGCATGTCTGTCCAG 
GCCGTGTCCAACAGGAGATCGACGACATGATAG 
GGCAGGTGCGGCGACCAGAGATGGGTGACCAG 
GCTCACATGCCCTACACCACTG 


149 




CAGTGGTGTAGGGCATGTGAGCCTGGTCACCCAT 
CTCTGGTCGCCGCACCTGCCCTATCAIGTCGTCG 
ATGTCCTGnGGACACGGCCTGGACAGACATGCG 
TCCCCACAATGGGTCAGCA 


150 




TCGACGACATGATAGGG 


151 




nCCTATCATGTCSTCGA 


152 


Arg343Gly 
CGG-GGG 


GGGACGGATGTCTGTCCAGGCCGTGTCCAACAG 
GAGATCGAGGACGTGATAGGGCAGGTG6GGCGA 
GCAGAGATGGGTGAGCAGGCTGAGATGGGGTACA 
CCACTGCCGTGATTGATGAGG 


153 




GCTCATGAATGACGGCAGTGGTGTAGGGCATGTG 
AGCGTGGTGAGCGATCTGTGGTGGGGCGAGGTGG 
GCTATGAGGTGGTGGATGTGGTGTTGGAGACGGG 
CTGGACAGAGATGGGTGGG 


154 




GGCAGGTGGGGCGACCA 


155 




TCGTCGCCCCACCTGCC 


156 


Arg365His 
CGC-CAC


CAGAGATG6GTGAGGAGGGTGAGATGCGCTAGAG 
GAGTGGGGTGATTGATGAGGTGGAGCAGTTTGGG 
GAGATCGTCGGCCTG6GTGTGAGCGATATGAGAT 
CCCGTGAGATGGAAGTACA 


157 




TGTAGTTGGATGTGACGGGATGTCATATGGGTGA 
GAGGGAGGGGGAGGATGTGGGGAAAGTGGTGCA 
GGTCATGAATCACGGCAGTGGTGTAGGGCATGTG 
AGGGTGGTCAGGGATCTGTG 


158 




GGTGGAGGAGTTTGGGG 


159 




CGGGAAAGTCGIGCAGG 


160 


Ile369Thr
ATC-ACC 


ACGAGGGTCAGATGCCGTAGAGCACTGGCGTGAT 
TGATGAGGTGGAGGGCITTGGGGACACGGTGCG 
GGTGGGTGTGAGCGATATGACATGGGGTGAGATG 
GAAGTACAGGGGTTGCGGAT 


161 




ATGCGGAAGCCCTGTACTTCGATGTCACGGGATG 
TGATATGGGTCACACCCAGGGGGACG6TGTCGC 
CAAAGCGCTGGACCTCATGAATCACGGCAGTGGT 
GTAGGGCATGTGAGCCTGGT 


162 




TGGGGAGACCGTGCCCC 


163 




GGGGGACGfiTGTCCCCA 


164 


Gly373Ser 
GGT-AGT 


ATGGCCTACACCACTGCCGTGATTCATGAGGTGC 
AGCGCTTTGGGGACATCGTCCCCGTGAGTGTGA 
CCCATATGACATCCCGTGAGATCGAAGTACAGGG 
CnCCGCATCCCTAAGGTAG 


165 




CTACCTTAGGGATGCGGAAGCGGTGTACTTCGAT 
GTCACGGGATGTCATATGGGTCACACTCAGGGG 
GAGGATGTCCGCAAAGCGGTGCAGCTGATGAATC 
ACGGCAGTGGTGTAGGGCAT 


166 




TCCCCCTGAGTGTGACC 


167 




GGTCACACICAGGGGGA 


168 


Val374Met 
GTG-ATG 


CCCTACACCACTGGCGTGAnCATGAGGTGCAGC 
GCTTTGGGGACATCGTCCCCCTGGGTATGACGCA 
TATGACATCCCGTGACATCGAAGTACAGGGCTTG 
CGCATCCCTAAGGTAGGCC 


169 




GGCCTACCTTAGGGATGCGGAAGCCCTGTACnC 
GATGTCACGGGATGTCATATGGGTCATACCGAGG 
GGGACGATGTCCCGAAAGCGCTGGAXTGATGAA 
TCAGGGCAGTGGTGTAGGG 


170 




CCCTGGGTATGACGCAT 


171 




ATGGGTCAIACCCAGGG 


172 


Glu410Lys 
GAG-AAG 


GCCCAGGGAAGGACACTGATCACCAACCTGTCAT 
CGGTGCTGAAGGATGAGGCCGTCTGGAAGAAGC 
GCTTCCGGTTCCAGCCCGAACAGTTCCTGGATGG 
GCAGGGCCACTTTGTGAAGC 


173 




GCnCACAAAGTGGCCCTGGGCATCCAGGMGT 
GnCGGGGTGGAAGCGGAAGGGCTTCTTCCAGA 
GGGCCTCATCCTTGAGCACCGATGACAGGTTGGT 
GATGAGTGTCGTTCCCTGGGG 


174 




CCGTCTGGAAGAAGCCC 


175 




GGGCTTCTICCAGACGG 


176 


Glu418Gln 
GAA-CAA 


AACCTGTCATCGGTGCTGAAGGATGA6GCCGTCT 
GGGAGAAGCCCTTCCGCnCCACCCCCAACACn 
CCTGGATGCCCAGGGGCACTrTGTGAAGCCGGA 
GGCCTTCCTGCCTTTCTCAG 


177 




CTGAGAAAGGCAGGAAGGCCTCCGGCnCACAA 
AGTGGCCCTGGGCATCCAGGAAGTGnGGGGGT 
GGAAGCGGAAGGGCTTCTCCCAGACGGCCTCAT 
CCTTCAGCACCGATGACAGGTT 


178 




TCCACCCCCAAGACTTC 


179 




GAAGTGTTGGGGGTGGA 


180 


Leu421Pro
CTG-CCG 


CGGTGCTGAAGGATGAGGCCGTCTGGGAGAAGC 
GGTTCCGCTTCCACCCCGAACACnCCCGGATGC 
CCAGGGCCACnTGTGAAGCCGGAGGCCnCCT 
GCCTTTCTCAGCAGGTGCCTG 


181 




CAGGCACCTGCTGAGAAAGGCAGGAAGGCCTCC 
GGCnCACAAAGTGGCCCTGGGCATCCGGGAAG 
TGnCGGGGTGGAAGCGGAAGGGCTTCTCCCAG 
ACGGCCTCATCCnCAGCACCG 


182 




ACACnCCfiGGATGCGC 


183 




GGGCATCC6GGAAGTGT 


184


Arg440His
CGC-CAC 


TCTTGCAGGGGTATCACGCAGGAGCCAGGCTCA 
CTGACGCGCCTGCCCTCCCCACAGGCCACCGTG 
CATGGCTCGGGGAGCCCCTGGGCCGCATGGAGC 
TCTTCCTCTTCTTCACCTGCCT 


185 




AGGGAGGTGAAGAAGAGGAAGAGCTCCATGCGG 
GCCAGGGGCTCCCCGAGGCATGCACGGTGGCCT 
GTGGGGAGGGGAGGGGCGTCAGTGAGCCTGGC 
TCCTGGGTGATACCCGTGCAAGA 


186 




CAGAGGCCACCGTGCAT 


187 




ATGCAC6GIGGCCTGTG 


188 


Met451lle 
ATG-ATA 


TGACGCCCCTCCCCTCCCCACAGGCCGCCGTGC 
ATGCCTCGGGGAGCCCCTGGGCCGCATAGAGCT 
CTTCGTCTrCTrCACCTCCCTGCTGCAGCACTTCA 
GCTTCTCGGTGGCCACTGGA 


189 




TCCAGTGGGCACCGAGAAGCTGAAGTGCTGCAG 
CAGGGAGGTGAAGAAGAGGAAGAGCTCTATGCG 
GGCGAGGGGCTCCCCGAGGa\TGCACGGCGGC 
CTGTGGGGAGGGGA6GGGCGTGA 


190 




GCCCGCATAGAGCTCn 


191 




AAGAGCTCIATGCGGGC 


192 


Ser486Thr 
AGC-ACC 


TCTCGGTGCCCACTGGACAGCCCCGGCCCAGCC 
ACCATGGTGTCTTTGCnTCCTGGTGACCCCATC 
CCCCTATGAGCTTTGTGCTGTGCCGGGCTAGAAT 
GGGGTACCTAGTCCCCAGCC 


193 




GGCTGGGGACTAGGTACCCCATTCTAGCGGGGC 
ACAGCAGAAAGCTCATAGGGGGATGGGGTCACC 
AGGAAAGCAAAGACACCATGGTGGCTGGGCCGG 
GGCTGTCCAGTGGGCACGGAGA 


194 




CCTGGTGAfiCCCATCCC 


195 




GGGATGGGGTCACCAGG 


196 



Aliquots of the coisogenic cell collection are thereafter separately contacted with a variety of chemotherapeutic agents presently used for, or contemplated for use in, treatment of breast adenocarcinoma, and alleles that increase or decrease sensitivity to the cytotoxic effects of the agents are identified.

All patents, patent publications, and other published references mentioned herein are hereby incorporated by reference in their entireties as if each had been individually and specifically incorporated by reference herein. While preferred illustrative embodiments of the present invention are described, one skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration only and not by way of limitation. The present invention is limited only by the claims that follow.

What is claimed is:

1. A collection of cultured cells, comprising:
at least 5 genotypically distinct cells,

wherein each of said at least 5 genotypically distinct cells is coisogenic with respect to the others of said at least 5 genotypically distinct cells at a target locus common thereamong, and

wherein each of said at least 5 genotypically distinct cells can be separately assayed.



2. The cell collection of claim 1, comprising at least 10 genotypically distinct cells.

3. The cell collection of claim 2, comprising at least 25 genotypically distinct cells.

4. The cell population of any one of claims 1 - 3, wherein said cells are mammalian cells.

5. The cell population of claim 4, wherein said mammalian cells are human cells.

6. The cell population of claim 4, wherein said mammalian cells are rodent cells.

7. The cell population of claim 6, wherein said rodent cells are mouse cells.

8. The cell population of any one of claims 1 - 3, wherein said cells are yeast cells.

9. The cell population of any one of claims 1 - 3, wherein said cells are plant cells.

10. The cell collection of any one of claims 1 - 9, wherein each of said genotypically distinct cells is disposed in fluid noncommunication with each of the other of said genotypically distinct cells.

11. The cell collection of claim 10, wherein each of said genotypically distinct cells is spatially addressable.

12. The cell collection of any one of claims 1 - 11, wherein said genotypically distinct cells collectively include each of the 20 natural amino acids at a single residue encoded at the target locus.

13. The cell collection of any one of claims 1 - 12, wherein said genotypically distinct cells collectively include a predetermined amino acid at each residue encoded after the initiator methionine at the target locus.

14. The cell collection of any one of claims 1 - 13, wherein said genotypically distinct cells collectively include at least one naturally occurring allele of the target locus.

15. The cell collection of claim 14, wherein said genotypically distinct cells collectively include a plurality of naturally occurring alleles of the target locus.

16. The cell collection of any one of claims 1 - 15, wherein said genotypically distinct cells further comprise a common selectable marker at a genomic locus different from said target locus.
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17. The cell cx)llectjon of any one of daims 1-16, wherein said 
genotypically distinct cells each further comprises a marker unique to said genotypically 
distinct cell, said mar1<er l)eing at a locus different from said target locus. 

18. The cell collection of any one of claims 1-17, wherein said target 
locus is selected from the group consisting of: CYP1A2, CYP2C17, CYP2D6, CYP2E, 
CYP3A4, CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9, 
CYP11A, CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3, 
CYP6D1, CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, 
CYP21A2, CYP24, CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4, 
ABCC5, ABCC6, MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2, 
LTA4H, TRAG3, GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase 
(UGT), sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, 
glutathione S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2. 

19. The cell collection of claim 18, wherein said target locus is ABCB1. 

20. The cell collection of any one of claims 1-19, wherein said 
coisogenic cells are legacy-free. 

21. The cell collection of any one of claims 1-20, wherein said 
coisogenic cells are exceptionally coisogenic. 

22. The cell collection of any one of claims 1-21, wherein said 
coisogenic cells are perfectly coisogenic. 

23. A kit, comprising: 

at least five genotypically distinct cells, said cells contained within separate, 
structurally discrete, fluidly noncommunicating containers, wherein each of said at least 5 
genotypically distinct cells is coisogenic with respect to the others of said at least 5 
genotypically distinct cells at a target locus common thereamong; 

wherein said at least five structurally discrete containers are commonly 
packaged. 

24. The kit of claim 23, wherein said at least five genotypically distinct, 
commonly packaged, cells constitute a coisogenic cell collection according to any one of 
claims 1-22. 

25. The kit of claim 23 or claim 24, further comprising: 

a computer readable medium, said computer readable medium containing 
a dataset that describes the target locus genotype of each of said genotypically distinct 
cells. 

26. A method of making a coisogenic cell collection, the method 
comprising: 

collecting at least 5 genotypically distinct cells, each of said genotypically 
distinct cells being coisogenic with respect to the others of said at least 5 genotypically 
distinct cells at a target locus common thereamong, into a collection in which each of said at 
least 5 genotypically distinct cells can be separately assayed. 

27. The method of claim 26, further comprising the antecedent step of: 

engineering, into at least four of said at least five cultured cells, said cells 
having derived from a common eukaryotic ancestor cell, a genomic sequence alteration at a 
target locus common thereamong, said sequence alterations being sufficient to cause at 
least five distinct protein sequences collectively to be encoded by said cells at said target 
locus. 

28. The method of claim 27, wherein said engineering is effected by 
introducing a targeting oligonucleotide into each of said at least four cultured cells. 

29. The method of claim 27, wherein said engineering step is effected 
by introducing into each of said at least four cultured cells a recombination-competent 
substrate into which said genomic sequence alteration has previously been introduced 
using a targeting oligonucleotide. 

30. A kit, comprising: 

at least four targeting oligonucleotides of distinct sequence; and 
a eukaryotic cell, 

wherein said oligonucleotides are sufficient for use in the method of claim 
28 to create the cell collections of any of claims 1-22 from said eukaryotic cell. 

31. A method of identifying genotypes of a target locus that alter a 
cellular phenotype, comprising: 

assaying each genotypically distinct cell of a coisogenic cell collection for a 
common phenotypic characteristic, wherein said genotypically distinct cells are coisogenic 
at said target locus, and wherein said collection is a coisogenic cell collection according to 
any one of claims 1-22; 

identifying from said assay results at least one cell having an altered 
phenotypic characteristic; and 

correlating, for at least said at least one cell with altered phenotypic 
characteristic, the results of said phenotypic assay with said cell's target locus genotype, 

the correlation of phenotypic assay results with target locus genotype 
identifying genotypes of said target locus that alter said cellular phenotype. 
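
By way of illustration only, and forming no part of the claim, the assaying, identifying, and correlating steps recited above might be sketched as follows. The names `AssayResult` and `genotypes_altering_phenotype`, the genotype labels, the reference genotype, and the numeric threshold for an "altered" phenotype are all hypothetical assumptions; the claim does not prescribe any particular data structure or criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayResult:
    """One assayed cell of the collection (hypothetical record layout)."""
    genotype: str     # label for the cell's target-locus genotype
    phenotype: float  # measured value of the common phenotypic characteristic


def genotypes_altering_phenotype(results, reference_genotype, threshold):
    """Correlate assay results with genotype.

    Returns a mapping of genotype -> phenotype for every cell whose
    phenotype differs from that of the reference cell by more than
    `threshold` (an assumed, illustrative criterion for "altered").
    """
    reference = next(
        r.phenotype for r in results if r.genotype == reference_genotype
    )
    return {
        r.genotype: r.phenotype
        for r in results
        if r.genotype != reference_genotype
        and abs(r.phenotype - reference) > threshold
    }
```

Because the cells are coisogenic at the target locus, a difference flagged by such a comparison can reasonably be attributed to the target-locus genotype rather than to background genetic variation.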

32. The method of claim 31, wherein said phenotypic characteristic is 
responsiveness of said cell to a xenobiotic. 

33. The method of claim 31 or claim 32, further comprising the 
antecedent step of: 

contacting said coisogenic cell collection with a xenobiotic. 

34. The method of any of claims 31-33, wherein said target locus is 
selected from the group consisting of: CYP1A2, CYP2C17, CYP2D6, CYP2E, CYP3A4, 
CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9, CYP11A, 
CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3, CYP6D1, 
CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, CYP21A2, 
CYP24, CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4, ABCC5, 
ABCC6, MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2, LTA4H, 
TRAG3, GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase (UGT), 
sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, 
glutathione S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2. 

35. The method of any one of claims 31-34, further comprising the 
step, after said correlating, of: 

collecting said correlations into at least one dataset. 

36. The method of claim 34, wherein said dataset is recorded on a 
computer-readable medium. 

37. A method of predicting a phenotypic characteristic of a cell based 
upon its genotype at a target locus, comprising: 

using said cell's genotype at said target locus, or a unique identifier thereof, 
as a query to retrieve from a dataset data that report a correlated phenotypic characteristic, 

wherein said dataset includes correlations of a phenotypic characteristic 
with target locus genotype for at least five cells that are coisogenic at said target locus, 

said retrieved phenotypic characteristic providing a prediction of said cell's 
phenotypic characteristic. 
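
By way of illustration only, and forming no part of the claim, the query step recited above can be pictured as a keyed lookup. In the sketch below, the functions `build_dataset` and `predict_phenotype`, the genotype labels, and the phenotype values are hypothetical assumptions; the only constraint taken from the claim is that the dataset holds correlations for at least five cells that are coisogenic at the target locus.

```python
def build_dataset(correlations):
    """Collect (genotype, phenotype) correlations into a lookup table.

    `correlations` is an iterable of pairs obtained from cells that are
    coisogenic at the target locus. Fewer than five distinct genotypes
    is rejected, mirroring the "at least five cells" language above.
    """
    dataset = dict(correlations)
    if len(dataset) < 5:
        raise ValueError("dataset must cover at least five coisogenic cells")
    return dataset


def predict_phenotype(dataset, genotype):
    """Use a genotype (or a unique identifier of it) as the query key.

    Returns the correlated phenotypic characteristic, or None when the
    dataset holds no entry for the queried genotype.
    """
    return dataset.get(genotype)
```

A genotype absent from the dataset yields no prediction in this sketch; how such a case should be handled is left open here.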

38. The method of claim 37, wherein said at least five cells that are 
coisogenic at said target locus genotype are a cell collection according to any one of claims 
1-22. 

39. The method of claim 37 or claim 38, wherein said target locus is 
selected from the group consisting of: CYP1A2, CYP2C17, CYP2D6, CYP2E, CYP3A4, 
CYP4A11, CYP1B1, CYP1A1, CYP2A6, CYP2A13, CYP2B6, CYP2C8, CYP2C9, CYP11A, 
CYP2C19, CYP2F1, CYP2J2, CYP3A5, CYP3A7, CYP4B1, CYP4F2, CYP4F3, CYP6D1, 
CYP6F1, CYP7A1, CYP8, CYP11A, CYP11B1, CYP11B2, CYP17, CYP19, CYP21A2, 
CYP24, CYP27A1, CYP51, ABCB1, ABCB4, ABCC1, ABCC2, ABCC3, ABCC4, ABCC5, 
ABCC6, MRP7, ABCC8, ABCC9, ABCC10, ABCC11, ABCC12, EPHX1, EPHX2, LTA4H, 
TRAG3, GUSB, TMPT, BCRP, HERG, hKCNE2, UDP glucuronosyl transferase (UGT), 
sulfotransferase, sulfatase, glutathione S-transferase (GST) -alpha, 
glutathione S-transferase -mu, glutathione S-transferase -pi, ACE, and KCHN2. 



